Writing Array-Abstractable Definitions
This article was written April 2003 by Lonnie Chrisman, Ph.D., of Lumina Decision Systems, updated Nov 2005, and originally in a PDF format. It was transferred to the Analytica Wiki Feb 2010.
The Benefits of Array Abstraction
One of the most powerful and useful features of Analytica is Intelligent Array Abstraction™.
When we give the variable Profit the definition Revenue - Expenses, Analytica automatically carries through any dimensions of Revenue and Expenses, seamlessly making the correspondence if the same dimension occurs in each. If the Revenue and Expenses variables are broken down by (i.e., indexed by) department and project, the resulting Profit will also be broken down across these two dimensions. A conventional programming language would force the modeler to explicitly loop over these dimensions when computing Profit, and a spreadsheet would require the modeler to explicitly copy the same formula to every cell in the two-dimensional grid. This manual overhead imposed on the modeler in those environments is distracting, and fundamentally unnecessary, since the relationship between Profit, Revenue and Expenses is entirely separate from whatever the dimensions happen to be in a particular model. This principle of separating the fundamental relationships between quantities (variables) and their dimensionality is what we call (Intelligent) Array Abstraction.
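For contrast, here is roughly what the explicit looping looks like in a conventional language. This is a minimal Python sketch, not Analytica code; the department and project names, the figures, and the function name profit_explicit are all made up for illustration.

```python
# Hypothetical data: revenue and expenses indexed by (department, project).
departments = ["Sales", "R&D"]
projects = ["Alpha", "Beta"]

revenue = {("Sales", "Alpha"): 120.0, ("Sales", "Beta"): 80.0,
           ("R&D", "Alpha"): 40.0, ("R&D", "Beta"): 60.0}
expenses = {("Sales", "Alpha"): 70.0, ("Sales", "Beta"): 90.0,
            ("R&D", "Alpha"): 55.0, ("R&D", "Beta"): 20.0}


def profit_explicit(revenue, expenses, departments, projects):
    """Profit = Revenue - Expenses, with the looping spelled out by hand."""
    profit = {}
    for d in departments:          # one loop per dimension...
        for p in projects:         # ...and the dimensions are hard-coded
            profit[(d, p)] = revenue[(d, p)] - expenses[(d, p)]
    return profit


profit = profit_explicit(revenue, expenses, departments, projects)
```

Adding a third dimension (say, region) to this sketch would mean changing the function signature, adding a loop, and rewriting every key. The Analytica definition Revenue - Expenses would stay exactly as it is.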
Once one has mastered the basics of array abstraction, it is hard to imagine going back to a more primitive modeling environment (such as a spreadsheet or standard programming language) without such a capability for any non-trivial modeling application. Array abstraction provides a flexibility that goes hand-in-hand with the modeling process. When we begin prototyping a model, the optimum dimensions to include are seldom clear. The modeling process usually involves the elucidation of relevant variables, specification of fundamental relationships, and experimentation with tradeoffs between the richness of the knowledge to build on, computational complexity, and level of detail. Determining the appropriate dimensionality for key variables is a task that usually falls most naturally after the relationships between variables have been worked out. Array abstraction allows the modeler to do just this: once an abstractable model is in place, dimensions can be added and removed with little effort. This can be contrasted with spreadsheets and conventional programming languages, where one must commit to the dimensionality of variables early in the modeling process. Because every computational step there involves explicitly defined iteration over the specified dimensions, altering the dimensionality of variables tends to be extremely tedious.
Array abstraction also promotes correctness. Errors made when copying cell references in a spreadsheet are common in practice and hard to detect; likewise, it is easy to make mistakes when writing explicit looping constructs in conventional programming languages, such as accidentally confusing the correspondence between dimensions. Separated from the clutter of all these looping constructs, variable definitions in Analytica are far clearer. Furthermore, array abstraction leaves one with a sense of confidence that solid portions of the model are and will remain correct, regardless of other modifications or dimensional adjustments that may be made outside of that sub-model.
In Analytica, array abstraction enables a number of other important capabilities. A major one of those is the uncertainty analysis that is built into Analytica. Uncertainty analysis in Analytica operates by adding a dimension to the uncertain variables in your model, named the Run dimension. Since your (presumably abstractable) model describes the relationships between variables, what is true without that dimension should also be true with that dimension, and thus Analytica can seamlessly propagate that dimension through the computations, even though as a modeler, you might not have conceptualized your variables up front as being dimensioned by Run.
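The idea of an extra Run dimension can be illustrated with a small Python analogy. This is only a sketch of the concept and says nothing about how Analytica implements it; the normal distribution, its parameters, the sample size, and the fixed expenses figure are all assumptions chosen for illustration.

```python
import random


def profit(revenue, expenses):
    """The model relationship, written once for a single (atomic) case."""
    return revenue - expenses


random.seed(0)        # fixed seed so the sketch is repeatable
sample_size = 1000    # hypothetical length of the "Run" dimension

# An uncertain input becomes a list with one value per run.
revenue_samples = [random.gauss(100.0, 10.0) for _ in range(sample_size)]
expenses = 70.0       # a certain input stays atomic

# The same relationship is applied along the run dimension, unchanged.
profit_samples = [profit(r, expenses) for r in revenue_samples]
mean_profit = sum(profit_samples) / sample_size
```

In Python the comprehension over runs has to be written by hand. In an abstractable Analytica model, the definition of Profit would not mention Run at all.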
Another important capability arising from array abstraction is What-If analysis. Adding new dimensions to key input variables enables one to simultaneously explore and compare multiple scenarios, without having to change the logic of your model. “What-if” analyses are also often the basis for decision making, since they allow one to explore spaces of possible decisions.
Non-abstractable Definitions
Most Analytica expressions (relationships) that one would write in a variable’s definition field will be “abstractable”, that is, it will be possible to arbitrarily add and adjust dimensions to input variables without altering the correctness of the result. Analytica will propagate any of these dimensions through the variables without any extra thought on the part of the modeler. However, there are some notable exceptions, where it is possible to write a definition that is not abstractable. When such a definition is written, the flexibility and benefits of array abstraction may be lost. A key area of the more advanced Analytica modeling skill set is being able to recognize such cases and to know how to “correctly” express the desired relationships in alternative, abstractable ways.
Expressions that are limited in their ability to be array abstracted often fall into one of three groups: expressions that introduce new dimensions, dimension-reducing expressions that don’t name the relevant dimension, and convergence/iterative/recursive algorithms where the total number of iterations depends on the computation itself and cannot be known in advance. Relationships falling in these categories can almost always be written in an array abstractable fashion if the modeler pays appropriate attention to how the expression is written.
An example of a non-abstractable expression in the first group, functions that introduce a new dimension, is the Sequence function, e.g., 1..N or Sequence(1,N). When N is “atomic” (i.e., not an array, containing no dimensions), this expression evaluates just fine, but if N is changed to a list of numbers, e.g., [10,20,30], the result would be a “non-rectangular” array, which is not allowed as an Analytica value. The presence of a Sequence function embedded in a more complex expression could make the entire expression non-abstractable if appropriate modeling style is not exercised, as in the following example of an expression to compute the factorial of a number N:
Product(1..N)
Although there is no inherent reason why the factorial of a number should not be abstractable, this particular way of writing the factorial would not be abstractable. Other functions also falling into the first group of expressions that (may) introduce dimensions include Subset, SplitText, SortIndex (if the second parameter is omitted), Concat, CopyIndex, IndexNames, and Unique.
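The non-rectangular result described above can be seen in a small Python analogy. This is a sketch rather than Analytica code: the helper sequence is a hypothetical stand-in for 1..N, and the values of N are kept smaller than [10,20,30] so the results are easy to check by hand.

```python
import math


def sequence(n):
    """Rough analogue of 1..N for an atomic (scalar) n."""
    return list(range(1, n + 1))


n_values = [2, 3, 4]  # hypothetical array-valued N

# Applying the sequence cell by cell gives rows of different lengths,
# i.e. a ragged (non-rectangular) structure.
ragged = [sequence(n) for n in n_values]
row_lengths = [len(row) for row in ragged]

# Reducing each row before collecting the results restores a rectangular
# result: one factorial per element of N.
factorials = [math.prod(sequence(n)) for n in n_values]
```

The point of the sketch is that the factorial itself has one well-defined value per element of N. Only the intermediate sequence is ragged, which is why the difficulty lies in how the expression is written rather than in the quantity being computed.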
An example from the second group of expressions, dimension-reducing expressions not naming the relevant dimension, is the expression Sum(A), where the second index parameter of Sum is omitted. This form sums over the “outer” dimension of A. As a modeler, there are generally only a couple of instances where you know what the outer dimension is: the case where A is guaranteed to be only one-dimensional, and the case where A contains an implicit dimension (which will always be the outer dimension). If you assume any other dimension is the outer dimension, you may be unpleasantly surprised when new dimensions are introduced. Thus, as a matter of style, it is best to always include the relevant dimension as a second parameter to the array reducing functions in all other cases, and you’ll steer clear of abstractability problems from this class of expressions. The array reducing functions with optional index parameters include Sum, Product, Max, Min, Average, Rank, ArgMax, ArgMin, SubIndex, ChanceDist, CumDist, and ProbDist. The function Size is essentially in this category as well, although there is no concept of a relevant index there. Size should only be used on parameters restricted to be one-dimensional, index parameters being a very important special case.
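The surprise that an unnamed reduction can cause is easy to reproduce with nested lists in Python. This is an analogy under stated assumptions, not Analytica's implementation: the helpers add, sum_outer and sum_over, the index names "Project" and "Scenario", and the cost figures are all hypothetical.

```python
from functools import reduce


def add(x, y):
    """Element-wise addition of two numbers or two equally nested lists."""
    if isinstance(x, list):
        return [add(i, j) for i, j in zip(x, y)]
    return x + y


def sum_outer(a):
    """Sum over whatever axis happens to be outermost (no index named)."""
    return reduce(add, a)


def sum_over(a, dims, index):
    """Sum over the axis named `index`; `dims` names the axes, outermost first."""
    if dims[0] == index:
        return reduce(add, a)
    return [sum_over(sub, dims[1:], index) for sub in a]


# One-dimensional case: both forms agree.
cost = [10, 20, 30]                                   # indexed by Project
total_unnamed = sum_outer(cost)                       # 60
total_named = sum_over(cost, ["Project"], "Project")  # 60

# A Scenario dimension is added and happens to end up outermost.
cost2 = [[10, 20, 30], [11, 22, 33]]                  # Scenario x Project
surprise = sum_outer(cost2)                           # summed over Scenario
intended = sum_over(cost2, ["Scenario", "Project"], "Project")
```

Here surprise is [21, 42, 63], still broken down by project, whereas intended is [60, 66], one total per scenario. Naming the index is what keeps the meaning of the sum stable when dimensions are added.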
Finally, the third class of expressions consists of iterative or recursive convergence algorithms with a dynamically determined number of iterations. This group almost always involves While or Iterate functions. For example, another non-abstractable expression for the factorial of an integer N is:
Var a := 1; Var fact := 1; While a < N Do ( a := a + 1; fact := fact * a )
As written, this expression assumes N to be atomic, and will not evaluate if N is an array. This expression will be discussed further below when discussing Horizontal and Vertical Array Abstraction. The fundamental problem with abstracting over a While is that the number of iterations may differ from cell to cell, so a single evaluation of the While loop does not account for the cell-by-cell differences in evaluation flow. It is important to realize that iterating over a dimension, either implicitly or via For or Using..In..Do, generally does not introduce abstraction limitations, because the number of iterations required is fixed in advance.
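The cell-to-cell difference in iteration counts can be made concrete with a Python transcription of the While expression above. This is a sketch for illustration only: the function name factorial_while and the sample values of N are assumptions, and the iteration counter is added purely to expose how many passes each cell needs.

```python
def factorial_while(n):
    """Transcription of the While-based factorial for a single atomic n.

    Returns the factorial together with the number of loop iterations used.
    """
    a = 1
    fact = 1
    iterations = 0
    while a < n:
        a = a + 1
        fact = fact * a
        iterations += 1
    return fact, iterations


n_values = [1, 3, 5]  # hypothetical array-valued N

# The loop must be run separately for each cell, because each cell
# terminates after a different number of iterations.
per_cell = [factorial_while(n) for n in n_values]
factorials = [fact for fact, _ in per_cell]
iteration_counts = [count for _, count in per_cell]
```

The iteration counts come out as 0, 2 and 4, so no single pass through the loop could serve all three cells. The outer comprehension, by contrast, iterates over a dimension whose length is known in advance, which is the kind of iteration that does not limit abstraction.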