As with any new research technique, there are methodological difficulties connected with the efficient use of computer simulation. Three basic classes of problems arise in using computer models: the specification of functional forms, the estimation of parameters, and the validation of the models.
The problem of specifying functional forms is an instance of the “embarrassment of riches.” Most mathematical models have been formulated in terms of linear equations in order to make analytic solutions attainable. Since this restriction is unnecessary for computer models, the way is opened to nonlinear functions of a wide variety of forms. The solution to this problem will probably come from two sources. First, as our empirical information increases (its collection stimulated and guided by attempts to formulate computer models), it will supply clues to the forms best suited to explaining and predicting behavior. Second, technical statistical criteria will be developed for selecting the proper forms of the equations efficiently, presumably on the basis of predictive power.
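A minimal sketch of the second idea, under assumptions invented purely for illustration: several candidate functional forms for a hypothetical single-equation relationship are fitted to one half of a sample and ranked by their predictive error on the other half. The data-generating process, the variable names, and the candidate forms are not taken from the text.

```python
import numpy as np

# Illustrative assumption: the "true" relationship is logarithmic.
rng = np.random.default_rng(0)
x = rng.uniform(1.0, 10.0, 200)
y = 2.0 + 1.5 * np.log(x) + rng.normal(0.0, 0.2, 200)

# Split into an estimation sample and a prediction (holdout) sample.
x_fit, x_hold = x[:100], x[100:]
y_fit, y_hold = y[:100], y[100:]

# Candidate functional forms, each expressed as a design-matrix builder.
candidates = {
    "linear":      lambda v: np.column_stack([np.ones_like(v), v]),
    "quadratic":   lambda v: np.column_stack([np.ones_like(v), v, v**2]),
    "logarithmic": lambda v: np.column_stack([np.ones_like(v), np.log(v)]),
}

# Rank the forms by out-of-sample mean squared prediction error.
for name, design in candidates.items():
    beta, *_ = np.linalg.lstsq(design(x_fit), y_fit, rcond=None)
    mse = np.mean((y_hold - design(x_hold) @ beta) ** 2)
    print(f"{name:12s} holdout MSE = {mse:.4f}")
```

Holdout error is only one possible predictive criterion; cross-validation or information criteria would serve the same screening role.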
The problems of parameter estimation have, of course, been much discussed in the statistical and econometric literature. A major advance has been the demonstration that consistent and efficient estimates can be obtained only by taking the simultaneity of the equations of a model into account. If this result carries over to computer process models, obtaining maximum likelihood estimates of all the parameters in such models will be a forbidding task.
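The passage states this estimation result abstractly; the following sketch illustrates it on an assumed two-equation supply-and-demand system. Ordinary least squares applied to one structural equation in isolation is distorted because the jointly determined price variable is correlated with the disturbance, while a two-stage least squares estimator that acknowledges the simultaneity recovers the structural slope. The system, its coefficients, and the choice of two-stage least squares (rather than full maximum likelihood) are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000

# Assumed structural system (both equations determine p and q jointly):
#   demand: q = 10 - 1.0*p + u
#   supply: q =  2 + 0.5*p + 1.0*z + v,  with z exogenous
z = rng.normal(0.0, 1.0, n)
u = rng.normal(0.0, 1.0, n)
v = rng.normal(0.0, 1.0, n)

# Reduced form obtained by solving the two equations for p, then q.
p = (10 - 2 - 1.0 * z + u - v) / (0.5 + 1.0)
q = 10 - 1.0 * p + u

# Naive OLS on the demand equation ignores that p is correlated with u.
X = np.column_stack([np.ones(n), p])
ols = np.linalg.lstsq(X, q, rcond=None)[0]

# Two-stage least squares: first project p on the exogenous z, then
# regress q on the fitted values, which are purged of the disturbance u.
Z = np.column_stack([np.ones(n), z])
p_hat = Z @ np.linalg.lstsq(Z, p, rcond=None)[0]
tsls = np.linalg.lstsq(np.column_stack([np.ones(n), p_hat]), q, rcond=None)[0]

print("true demand slope: -1.0")
print("OLS slope (distorted):", round(ols[1], 3))
print("2SLS slope:           ", round(tsls[1], 3))
```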
A more feasible approach to the parameter estimation problem may be to restrict attention to the joint determination of only the current endogenous variables within a single period and to consider that the values of the lagged endogenous variables are subject to errors. The parameter estimation problem must then be considered within the framework of an “errors in the variables” model rather than an “errors in the equations” model. A few econometricians have investigated this kind of estimation problem, and their results may prove applicable to computer models.
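A sketch of the errors-in-the-variables situation, again under assumptions invented for illustration: the regressor (standing in for a lagged endogenous variable) is observed with measurement error, so ordinary least squares is attenuated toward zero, while a Deming (orthogonal) regression that assumes a known ratio of error variances recovers the slope. Deming regression is one of the simplest errors-in-variables estimators and is not necessarily the approach the text has in mind.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5000
true_slope, true_intercept = 0.8, 1.0

# Latent "true" lagged variable and the relationship it drives.
x_true = rng.normal(0.0, 2.0, n)
y = true_intercept + true_slope * x_true + rng.normal(0.0, 0.5, n)

# Only an error-ridden measurement of the lagged variable is observed.
x_obs = x_true + rng.normal(0.0, 1.0, n)

# OLS on the observed regressor: the slope is attenuated toward zero.
ols_slope = np.cov(x_obs, y, bias=True)[0, 1] / np.var(x_obs)

# Deming regression, assuming the ratio of error variances
# delta = var(error in y) / var(error in x) is known (here 0.25 / 1.0).
delta = 0.25 / 1.0
sxx = np.var(x_obs)
syy = np.var(y)
sxy = np.cov(x_obs, y, bias=True)[0, 1]
deming_slope = (syy - delta * sxx +
                np.sqrt((syy - delta * sxx) ** 2 + 4 * delta * sxy ** 2)) / (2 * sxy)

print("true slope:  ", true_slope)
print("OLS slope:   ", round(ols_slope, 3))
print("Deming slope:", round(deming_slope, 3))
```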
The likelihood that a process model will describe the world incorrectly is high, because such a model makes strong assertions about the nature of the world. A model can fail to describe the world to varying degrees, however, so it is meaningful to say that some models are more adequate descriptions of reality than others. Criteria must be devised to indicate when the time paths generated by a process model agree with the observed time paths closely enough that the agreement cannot be attributed to mere coincidence; that is, tests of the “goodness of fit” between process models and the real world are needed. The problem of model validation becomes even more difficult when the available data about the “actual” behavior of the world are themselves subject to error.
Although the formal details have not yet been adequately developed, there appear to be at least three possible ways in which the validation problem for process models can be approached:
- Distribution-free statistical methods can be used to test whether the actual and the generated time series display similar timing and amplitude characteristics (sketched after this list).
- Simple regressions of the generated series on the actual series can be computed, and we can then test whether the resulting regression equations have intercepts not significantly different from zero and slopes not significantly different from unity (sketched after this list).
- We can perform a factor analysis on the set of generated time paths and another on the set of observed time paths, and then test whether the two sets of factor loadings differ significantly from each other (sketched after this list).
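One concrete, and entirely illustrative, reading of the first approach: amplitude is compared through a two-sample Kolmogorov-Smirnov test on the values of the two series, and timing through a Spearman rank correlation of their contemporaneous movements. Both tests are distribution-free, but the choice of these particular statistics, and the toy series themselves, are assumptions not found in the text.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)

# Stand-ins for an observed time path and a model-generated one
# (in practice these come from the data and from a simulation run).
t = np.arange(100)
actual = np.sin(2 * np.pi * t / 20) + rng.normal(0.0, 0.3, t.size)
generated = 0.9 * np.sin(2 * np.pi * t / 20) + rng.normal(0.0, 0.3, t.size)

# Amplitude: do the two series draw their values from similar distributions?
ks_stat, ks_p = stats.ks_2samp(actual, generated)

# Timing: do the two series move up and down together?  A rank correlation
# of the contemporaneous values is one distribution-free check.
rho, rho_p = stats.spearmanr(actual, generated)

print(f"Kolmogorov-Smirnov: statistic={ks_stat:.3f}, p-value={ks_p:.3f}")
print(f"Spearman rank correlation: rho={rho:.3f}, p-value={rho_p:.3f}")
```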
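A sketch of the second approach with hypothetical series: the generated series is regressed on the actual series, and ordinary t statistics are used to test that the intercept does not differ significantly from zero and the slope does not differ significantly from unity.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)

# Hypothetical series: replace with the observed and simulated time paths.
actual = np.cumsum(rng.normal(0.0, 1.0, 120))
generated = actual + rng.normal(0.0, 0.5, 120)

# Regress the generated series on the actual series.
res = stats.linregress(actual, generated)
dof = actual.size - 2

# Test H0: intercept = 0 and H0: slope = 1 with ordinary t statistics.
t_intercept = res.intercept / res.intercept_stderr
t_slope = (res.slope - 1.0) / res.stderr
p_intercept = 2 * stats.t.sf(abs(t_intercept), dof)
p_slope = 2 * stats.t.sf(abs(t_slope), dof)

print(f"intercept = {res.intercept:.3f}, p-value for intercept = 0: {p_intercept:.3f}")
print(f"slope     = {res.slope:.3f}, p-value for slope = 1:     {p_slope:.3f}")
```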
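For the third approach, a heavily hedged sketch: separate factor analyses are fitted to a panel of “observed” and a panel of “generated” time paths, and the resulting loading vectors are compared with Tucker’s congruence coefficient. The text calls for a formal significance test of the difference in loadings, which is not developed here; the congruence coefficient is only a descriptive measure of similarity, and the data, the single-factor specification, and the library choice are assumptions.

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(5)
n_periods, n_vars = 120, 6

# Hypothetical panels: each column is one variable's time path, and a single
# common factor drives both the "observed" and the "generated" panel.
common = rng.normal(0.0, 1.0, n_periods)
loadings_true = rng.uniform(0.5, 1.5, n_vars)
observed = np.outer(common, loadings_true) + rng.normal(0.0, 0.5, (n_periods, n_vars))
generated = np.outer(common, loadings_true) + rng.normal(0.0, 0.5, (n_periods, n_vars))

# Separate single-factor analyses on the two panels.
load_obs = FactorAnalysis(n_components=1).fit(observed).components_.ravel()
load_gen = FactorAnalysis(n_components=1).fit(generated).components_.ravel()

# Tucker's congruence coefficient between the two loading vectors
# (an absolute value of 1 means identical patterns up to sign and scale).
phi = load_obs @ load_gen / np.sqrt((load_obs @ load_obs) * (load_gen @ load_gen))
print(f"congruence of factor loadings: {abs(phi):.3f}")
```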
Source: Skyttner, Lars (2006), General Systems Theory: Problems, Perspectives, Practice, 2nd edition, World Scientific Publishing Co.