New Problems in Software Complexity
Abstract
Complexity is a major problem in modern simulation software.  Users always
want more features, which makes both implementing and mastering the
software steadily more difficult.  Language designers have estimated
that a good computer language should have fewer than 100 keywords. If it has
more, the compiler becomes large and clumsy; the limited acceptance of
PL/I illustrates the fate of languages with large, clumsy compilers.  The
user's manual for a language should be fewer than 100 pages long to allow
the average user to master all its features. Clearly, these are serious
limitations for simulation software, since the number of required keywords
is dictated by the complexity of the underlying tasks rather than by
the wishes of the language designer.
How can we overcome this problem? I believe one of the keys lies in
separating data management from the language itself.  A simulation system
should have a data base management system (DBMS) adapted to the specific
needs of simulation.  Standridge has described such a DBMS. Programs that
perform different parts of a system analysis may then be implemented
independently; they can communicate through the data base.  This approach
has many advantages; let me begin with some that are unique to simulation:
 - Ability to combine outputs from several runs (sometimes called an
    overplot). Many simulation languages, such as CSMP-III, offer this
    feature.  However, it becomes natural when all results are saved in a
    data base.  A separate postprocessor can then readily retrieve and
    display data, regardless of which specific run created it. The
    postprocessor communicates with the simulation program only through
    the data base.
 - Ability to combine simulated and experimental results. All that is
    necessary is a real time data acquisition program that stores its
    results in the data base. Although this combination would greatly
    simplify model validation, it does not exist in any current simulation
    language of which I am aware.
 - Dynamic loading of tables. The simulationist need not enter tabular
    data directly into the simulation program (e.g., using a CSMP FUNCTION
    statement).  Instead, the tables are simply stored in the data base.
    These data may be user-generated, generated by other simulation runs, or
    even obtained from real measurements. For example, this feature would
    greatly simplify the solution of the finite-time Riccati differential
    equation.  This equation must be integrated backward in time, while the
    system equations must then be integrated forward in time using the
    stored backward solution.
 - Statistical analysis of noisy data. One often would like to analyze
    stochastic results statistically. It is natural to store them in the
    data base and let an independent statistical analysis program (e.g.,
    SAS or SPSS) perform the analysis.  The simulation analyst then need
    not duplicate the functions of well-established packages.
 - Range analysis. With stochastic models, one often wishes to display
    the range of the results. Managers generally prefer this representation,
    since it allows them to see trends and confidence limits that are
    difficult to express otherwise.  Again, a postprocessor can perform this kind of
    analysis through the data base without involving the simulation
    language at all.
 - Actual storage of models, parameter values, experimental frames, and
    other information related to the simulation project.  Modeling projects,
    like other projects, have their own internal data base; this may include
    partial models, models with different levels of detail or different
    orientations, validation tests, and base cases. Usually the analyst
    keeps this information in a file cabinet or in a stack of old runs.
    Logically, like personnel files or accounting records, it now
    belongs in a data base system. Oren and Zeigler have discussed the
    automation of model management.
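
The first few items in this list can be sketched in a few lines of Python.
The sketch below is only an illustration under stated assumptions: it uses
SQLite from the Python standard library in place of a simulation-specific
DBMS, and the `samples` table, its columns, the run names, and the toy decay
model are all hypothetical, not taken from the paper. It shows a simulation
program and a data acquisition program writing to the same store, and a
postprocessor that builds an overplot and a range analysis from the store
alone.

```python
import random
import sqlite3


def open_results_db(path=":memory:"):
    """Open the shared results store (hypothetical one-table schema)."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS samples ("
        " run_id TEXT, source TEXT, t REAL, value REAL)"
    )
    return conn


def store_run(conn, run_id, source, points):
    """Writer side: used by a simulation or a data acquisition program."""
    conn.executemany(
        "INSERT INTO samples VALUES (?, ?, ?, ?)",
        [(run_id, source, t, v) for t, v in points],
    )
    conn.commit()


def simulate_decay(rate, noise, seed, steps=5, dt=1.0):
    """Toy stochastic model: x' = -rate * x, forward Euler plus Gaussian noise."""
    rng = random.Random(seed)
    x, points = 1.0, []
    for k in range(steps):
        points.append((k * dt, x))
        x += dt * (-rate * x) + rng.gauss(0.0, noise)
    return points


def fetch_overplot(conn, run_ids):
    """Postprocessor: collect the curves of several runs for one plot."""
    curves = {}
    for run_id in run_ids:
        curves[run_id] = conn.execute(
            "SELECT t, value FROM samples WHERE run_id = ? ORDER BY t",
            (run_id,),
        ).fetchall()
    return curves


def range_analysis(conn, source="simulated"):
    """Postprocessor: (t, min, mean, max) over all runs from one source."""
    return conn.execute(
        "SELECT t, MIN(value), AVG(value), MAX(value) FROM samples"
        " WHERE source = ? GROUP BY t ORDER BY t",
        (source,),
    ).fetchall()


if __name__ == "__main__":
    conn = open_results_db()
    for seed in range(3):  # three stochastic simulation runs
        store_run(conn, f"sim-{seed}", "simulated",
                  simulate_decay(0.3, 0.02, seed))
    # Made-up "measured" values standing in for a data acquisition program.
    store_run(conn, "lab-1", "measured", [(0.0, 1.0), (1.0, 0.72), (2.0, 0.5)])
    print(fetch_overplot(conn, ["sim-0", "lab-1"]))
    for row in range_analysis(conn):
        print(row)
```

Neither postprocessor function calls the simulation code; both see only
rows in the store, so simulated and measured curves are retrieved the same way.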
 
				
Another advantage of this approach is that the independent modules can have
separate manuals. A manager may then, for example, study only the
postprocessor's manual, since he or she will use the computer only to display
data produced by other people's programs. The approach also allows a natural
division of tasks, rather than requiring people to write programs that must
communicate with each other directly. 
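
This division of tasks can be illustrated with the table-loading case from
the list above. The following is a minimal sketch, again assuming SQLite as a
stand-in for the DBMS; the `tables` schema, the function names, and the table
name `gain` are hypothetical. One module writes a tabulated function, another
loads it and interpolates linearly, and the two share nothing but the stored
table. A backward Riccati run could, under the same assumptions, store its
solution this way for a later forward run.

```python
import bisect
import sqlite3


def write_table(conn, name, pairs):
    """Producer module: store a tabulated function y(x) under a name."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tables (name TEXT, x REAL, y REAL)"
    )
    conn.execute("DELETE FROM tables WHERE name = ?", (name,))
    conn.executemany(
        "INSERT INTO tables VALUES (?, ?, ?)",
        [(name, x, y) for x, y in pairs],
    )
    conn.commit()


def load_table(conn, name):
    """Consumer module: return a linear interpolator over the stored table.

    Assumes at least two points with distinct x values; inputs outside the
    tabulated range are clamped to the first or last y value.
    """
    rows = conn.execute(
        "SELECT x, y FROM tables WHERE name = ? ORDER BY x", (name,)
    ).fetchall()
    if len(rows) < 2:
        raise ValueError(f"table {name!r} needs at least two points")
    xs = [x for x, _ in rows]
    ys = [y for _, y in rows]

    def lookup(x):
        if x <= xs[0]:
            return ys[0]
        if x >= xs[-1]:
            return ys[-1]
        i = bisect.bisect_right(xs, x)  # xs[i - 1] <= x < xs[i]
        x0, x1, y0, y1 = xs[i - 1], xs[i], ys[i - 1], ys[i]
        return y0 + (y1 - y0) * (x - x0) / (x1 - x0)

    return lookup


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    # Points may arrive in any order; the reader sorts them by x.
    write_table(conn, "gain", [(0.0, 0.0), (2.0, 4.0), (1.0, 1.0)])
    gain = load_table(conn, "gain")
    print(gain(0.5), gain(1.5))
```

Replacing the table contents requires no change to the consuming program,
which is the point of keeping tabular data out of the simulation source.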
 
Interested in reading the full paper? (2 pages, 172,853 bytes, pdf)
 
Last modified: January 12, 2006 -- © François Cellier