New Problems in Software Complexity
Abstract
Complexity is a major problem in modern simulation software. Users always
want more features, thus making both the implementation and the mastering
of software constantly more difficult. Language designers have estimated
that a good computer language should have fewer than 100 keywords. If it has
more, the compiler becomes large and clumsy; the limited acceptance of
PL/I illustrates the fate of languages with large, clumsy compilers. The
user's manual for a language should be less than 100 pages long to allow
the average user to master all its features. Clearly, these are serious
limitations for simulation software, since the number of required keywords
is dictated by the complexity of the underlying tasks rather than by
the wishes of the language designer.
How can we overcome this problem? I believe one of the keys lies in
separating data management from the language itself. A simulation system
should have a data base management system (DBMS) adapted to the specific
needs of simulation. Standridge has described such a DBMS. Programs that
perform different parts of a system analysis may then be implemented
independently; they can communicate through the data base. This approach
has many advantages; let me begin with some that are unique to simulation:
- Ability to combine outputs from several runs (sometimes called an
overplot). Many simulation languages, such as CSMP-III, offer this
feature. However, it becomes natural when all results are saved in a
data base. A separate postprocessor can then readily retrieve and
display data, regardless of which specific run created it. The
postprocessor communicates with the simulation program only through
the data base.
- Ability to combine simulated and experimental results. All that is
necessary is a real-time data acquisition program that stores its
results in the data base. Although this combination would greatly
simplify model validation, it does not exist in any current simulation
language of which I am aware.
- Dynamic loading of tables. The simulationist need not enter tabular
data directly into the simulation program (e.g., using a CSMP FUNCTION
statement). Instead, the tables are simply stored in the data base.
This data may be user generated, generated by other simulation runs, or
even generated by real measurements. For example, this feature would
greatly simplify the solution of the finite-time Riccati differential
equation: the Riccati equation must be integrated backward in time, and
the system equations must then be integrated forward using the stored
backward solution.
- Statistical analysis of noisy data. One often would like to analyze
stochastic results statistically. It is natural to store them in the
data base and let an independent statistical analysis program (e.g.,
SAS or SPSS) perform the analysis. The simulation analyst then need
not duplicate the functions of well established packages.
- Range analysis. With stochastic models, one often wishes to display
the range of the results. Managers generally prefer this representation,
since it allows them to see trends and confidence limits that are
difficult to express otherwise. Again, a postprocessor can perform this kind of
analysis through the data base without involving the simulation
language at all.
- Actual storage of models, parameter values, experimental frames, and
other information related to the simulation project. Modeling projects,
like other projects, have their own internal data base; this may include
partial models, models with different levels of detail or different
orientations, validation tests, and base cases. Usually the analyst
keeps this information in a file cabinet or in a stack of old runs.
Logically, nowadays, like personnel files or accounting records, it
belongs in a data base system. Oren and Zeigler have discussed the
automation of model management.
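The common thread in the advantages above is that every program reads and writes the shared data base rather than talking to the others directly. A minimal modern sketch of that architecture follows; the schema, table, and function names are my own invention, and Python's built-in sqlite3 module merely stands in for a simulation-specific DBMS such as the one Standridge describes. Two independent "programs" (a simulation and a measurement acquisition step) store trajectories under run identifiers, and a separate postprocessor retrieves and overlays them without knowing which program produced which run:

```python
import sqlite3

def store_run(db, run_id, trajectory):
    """Store one run's (time, value) pairs in the shared data base.
    The producing program needs no knowledge of any consumer."""
    db.executemany(
        "INSERT INTO results (run_id, t, value) VALUES (?, ?, ?)",
        [(run_id, t, v) for t, v in trajectory],
    )
    db.commit()

def overplot(db, run_ids):
    """Postprocessor: retrieve several runs through the data base alone,
    regardless of which specific run or program created the data."""
    curves = {}
    for run_id in run_ids:
        curves[run_id] = db.execute(
            "SELECT t, value FROM results WHERE run_id = ? ORDER BY t",
            (run_id,),
        ).fetchall()
    return curves

# Shared data base (in memory here; a file in practice).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE results (run_id TEXT, t REAL, value REAL)")

# A simulation run and a real measurement communicate only via the data base.
store_run(db, "simulated", [(0.0, 1.0), (1.0, 0.5)])
store_run(db, "measured",  [(0.0, 1.1), (1.0, 0.45)])

curves = overplot(db, ["simulated", "measured"])
```

The same `results` table would equally serve dynamic table loading or an external statistics package: each is just another independent reader or writer of the data base.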
Another advantage of this approach is that the independent modules can have
separate manuals. A manager may then, for example, study only the
postprocessor's manual, since he or she will use the computer only to display
data produced by other people's programs. The approach also allows a natural
division of tasks, rather than requiring people to write programs that must
communicate with each other directly.
Last modified: January 12, 2006 -- © François Cellier