Subscript-Slice Operator

Revision as of 22:49, 6 March 2010 by Max


The subscript operator X[I = v] returns the element, or more generally, slice of array X for which index I is equal to value v.

The slice operator X[@I = n] returns the slice of array X for the nth element of index I. The subscript operator uses associational indexing, identifying the slice by the value or label of an element of index I. The slice operator uses positional indexing, identifying the slice by its position in index I.

If no element of I matches the value v, the subscript operator returns Null and issues a warning (unless warnings are turned off). The slice operator does the same if n is not an integer between 1 (the first element of I) and the size of I.

The principle of implicit indexing means that any value X that is not explicitly an array indexed by I is assumed to have the same value for each element of I. Thus, if X is not indexed by I, X[I=v] and X[@I=n] are "no ops" that return the value of X (provided v is an element of I).
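
As a rough illustration of these rules, here is a small Python model (not Analytica code). The helper names subscript and slice_at are hypothetical, a 1-D array is modeled as a list of values aligned with a list of index labels, and None stands in for Null. The sketch ignores warnings and handles only one dimension.

```python
# Rough Python model of the subscript/slice rules above (not Analytica code).
# A 1-D array is a list of values aligned with a list of index labels.

def subscript(values, index, v):
    """Model of X[I = v]: associational lookup by label."""
    if v not in index:
        return None                      # stands in for Null (plus a warning)
    if not isinstance(values, list):     # X is not indexed by I: implicit indexing
        return values
    return values[index.index(v)]        # first matching label

def slice_at(values, index, n):
    """Model of X[@I = n]: positional lookup, positions start at 1."""
    if not isinstance(n, int) or not 1 <= n <= len(index):
        return None                      # out of range
    if not isinstance(values, list):     # implicit indexing again
        return values
    return values[n - 1]

I = ['A', 'B', 'C', 'D']
X = [100, 200, 250, 300]

by_label = subscript(X, I, 'C')      # value at label 'C'
by_position = slice_at(X, I, 2)      # value at position 2
missing = subscript(X, I, 'Z')       # no such label
scalar = subscript(7, I, 'B')        # a scalar is constant over I
```

The last line models the "no op" case: a scalar is treated as constant over I, so the lookup returns the scalar itself.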

The functions Subscript(x, I, v) and Slice(x, I, n) are exactly equivalent to x[I=v] and x[@I=n] respectively.


The first element X can be any expression, for example:

Sum(Y, J)[K = 2]

The value v can also be an expression, and it can evaluate to an array of values:

INDEX I := ['A', 'B', 'C', 'D']
VAR X := Table(I)(100, 200, 250, 300)
INDEX J := ['D', 'C']
X[I = J] -> Array(J, [300, 250])

You can use this scheme to change an array from one index to another that contains the same or a subset of the same values.
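
The lookup X[I = J] above can be mimicked in Python (an analogy only, not Analytica code) by treating X as a mapping from the labels of I to values and looking up each element of J in turn. The name X_by_J is hypothetical.

```python
# Python analogy for X[I = J] (not Analytica code).
I = ['A', 'B', 'C', 'D']
X = dict(zip(I, [100, 200, 250, 300]))   # X indexed by I

J = ['D', 'C']                           # a subset of I, in a different order

# One associational lookup per element of J; the result is indexed by J.
X_by_J = {j: X[j] for j in J}
```

The result keeps the order of J, which is why the same scheme also serves for re-ordering.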

Description

The slice and subscript operations are among the most powerful and most often used operators in the Analytica expression language, and because of this, Analytica provides a special concise syntax for these. The subscript operator is equivalent to calling the Subscript function, i.e., the following two expressions are exactly equivalent:

A[I=x]  
Subscript(A,I,x)

and likewise, the slice operator is equivalent to calling the Slice function, i.e., these two are identical:

A[@I=n]
Slice(A,I,n)

The subscript operator A[I=x] returns the slice of array A corresponding to the (first) index value equal to x. This is associational indexing -- selecting the slice by value.

The dual slice operator, A[@I=n], returns the nth slice of A along index I. This is positional indexing.

In both cases, if the original array is indexed by I, the result will have one less dimension (the dimension I being removed). When either operator is applied to an array not indexed by I, the result is the original array, unchanged. (In Analytica, an array not indexed by I is considered equivalent to an array indexed by I but constant over that index.)

The slice and subscript operators can be applied in succession -- to slice down across more than one index. Two equivalent syntaxes are possible, and in both, slice and subscript operators can be intermixed, for example:

expr[I=x][@J=n]
expr[I=x,@J=n]

Interaction with Array Abstraction

When x or n is atomic, the slice and subscript operations usually reduce the dimensionality of the array by one (the exception being when the index is not a dimension of the original value). In general, however, x and n can be arbitrary expressions whose results may be array-valued. Dimensions appearing in x or n will appear in the result, so the dimensionality may actually increase as a result of applying the slice or subscript operator.

Because array abstraction operates over the x and n parameters, the slice and subscript operators serve as extremely general lookup operators. Fairly complex operations equivalent to re-indexing, VLookup operations in spreadsheets, outer-joins in relational databases, and others are achieved quite simply and directly using subscript or slice operations. These operators are also used for sorting or re-ordering arrays, filtering rows, and other operations. This flexibility relieves Analytica of the need for the plethora of lookup functions found in many other languages. However, mastering the full power of the slice and subscript operators may take some time.

Re-indexing

Re-indexing is a common operation -- replace one index of an array by another index. If indexes I and J have the same elements in the same order with no duplicates, which might arise if J is defined as:

  J := CopyIndex(I)

then this simple expression re-indexes A, assuming A is indexed by I:

 A[I=J]

The result is identical to A, except that it is indexed by J instead of I.

You can do the same using the positional operator:

 A[@I=@J]

The advantage of this is that it works if I and J are the same length but have different elements, or if the indexes contain duplicate elements.

Here is an example that computes the outer product of a vector V, indexed by I, with its transpose:

Index J := CopyIndex(I)  { creates J with identical length and values to I }
V * V[I=J]

The result is the outer product, with dimensions IxJ.
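
To see why V * V[I=J] yields a two-dimensional result, consider this Python analogy (not Analytica code; the sample values are made up). V[I=J] holds the same values as V but varies over J, so multiplying it by V combines every element of I with every element of J.

```python
# Python analogy for V * V[I=J] (not Analytica code).
I = [1, 2, 3]
V = {1: 2.0, 2: 3.0, 3: 5.0}      # V indexed by I, made-up sample values

J = list(I)                       # stands in for CopyIndex(I)
V_by_J = {j: V[j] for j in J}     # stands in for V[I=J]

# The product varies over both I and J, giving the outer product.
outer = {(i, j): V[i] * V_by_J[j] for i in I for j in J}
```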

Re-ordering

The slice and subscript operators can be used to re-order an array in various scenarios. Often, you will have an index J, which is a permutation of index I. In this case, a re-indexing re-orders the elements of A, using just A[I=J].

A common example of this is sorting. Suppose Row and Col are both indexes of array A, and you want to sort on the column A[Col='ROI']. Here you would create a new index, SortedRow, defined as

SortIndex( A[Col='ROI'] )

and then compute the sorted array using A[Row=SortedRow].
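
The two steps can be sketched in Python as follows (an analogy, not Analytica code; the data values are made up for illustration). The sketch assumes an ascending sort on the ROI column.

```python
# Python analogy for sorting rows by one column (not Analytica code).
Row = ['r1', 'r2', 'r3']
Col = ['Cost', 'ROI']
A = {                                  # A[row][col], made-up sample data
    'r1': {'Cost': 50, 'ROI': 0.12},
    'r2': {'Cost': 20, 'ROI': 0.30},
    'r3': {'Cost': 80, 'ROI': 0.05},
}

# Model of SortIndex(A[Col='ROI']): the Row labels ordered by ascending ROI.
roi = {r: A[r]['ROI'] for r in Row}
SortedRow = sorted(Row, key=lambda r: roi[r])

# Model of A[Row=SortedRow]: look up each row by label, in the new order.
A_sorted = [(r, A[r]) for r in SortedRow]
```

The sort produces an ordering of labels rather than of values, and the subscript step then applies that ordering to every column at once.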

(To do: Discuss a positional dual to SortIndex. See the Rank function. There are some caveats to positional sorting).

Another example of re-ordering is reversing the elements of an array. In Analytica, the Dynamic function computes starting from the beginning of Time. In many dynamic programming applications, we would like to start from the final time point and work our way back. This is often done by reversing the array so that Dynamic can be used, and then reversing the result once computed. To reverse an array, the slice operator is the most convenient:

A[@I=size(I)-@I+1]
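
The position arithmetic can be checked with a short Python model (not Analytica code; the sample values are made up). Positions are 1-based, as in Analytica, so the list access subtracts 1.

```python
# Python model of A[@I = size(I) - @I + 1] (not Analytica code).
I = ['a', 'b', 'c', 'd']
A = [10, 20, 30, 40]          # A indexed by I, made-up sample values
size = len(I)

# For each 1-based position pos of I, take the slice at size - pos + 1.
reversed_A = [A[(size - pos + 1) - 1] for pos in range(1, size + 1)]
```

Position 1 maps to position size(I) and position size(I) maps to position 1, so the expression never goes out of range.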

Shifting left or right along a given index is also fairly common, and can be achieved directly using the slice operator. The basic form of a shift-left or shift-right is A[@I=@I+1] or A[@I=@I-1] respectively, but there is an additional consideration regarding what value to use for the rightmost or leftmost element of the result. This consideration is covered in the "Out of Range conditions" section below.

Filtering

Filtering extracts a subset of "rows" from an array along a given dimension. To obtain a subset of "rows" along dimension I, we need to introduce (or compute) a separate index, say I2, containing a subset of the elements of I. This index can be computed using the Subset function. For example, to obtain the subset of people younger than 30, a subset of the People index, we might define the index YoungPeople as:

Subset( PersonData[Trait='Age'] < 30 )

and then obtain the filtered data set as:

PersonData[ People=YoungPeople ]
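
In Python terms (an analogy, not Analytica code; the names and ages below are made up), the two steps are: build the list of labels that satisfy the condition, then look up each of those labels in the original data.

```python
# Python analogy for filtering with Subset and subscript (not Analytica code).
People = ['Ann', 'Bob', 'Cy']
PersonData = {                             # made-up sample data
    'Ann': {'Age': 25, 'Height': 170},
    'Bob': {'Age': 41, 'Height': 180},
    'Cy':  {'Age': 29, 'Height': 165},
}

# Model of Subset(PersonData[Trait='Age'] < 30): labels where the test holds.
YoungPeople = [p for p in People if PersonData[p]['Age'] < 30]

# Model of PersonData[People=YoungPeople]: keep only those rows.
filtered = {p: PersonData[p] for p in YoungPeople}
```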


Multi-step lookup (outer-join)

Often we have multiple arrays containing different types of data, and we need to join the tables to compute a trait across a different dimension. Consider the following example, where we have two arrays, Salary_by_profession and Profession_by_person, indexed by Profession and Person respectively as follows:

Profession          Salary_by_profession
'Dock loader'       $45,000
'Crane operator'    $75,000
'Forklift driver'   $32,000
...

Person           Profession_by_person
'Joe Smith'      'Crane operator'
'Mark Jones'     'Forklift driver'
'Greg Johnson'   'Forklift driver'
...

Given these arrays, suppose we want to compute Salary_by_person. Here you want to use the profession of a person to subscript into the Salary_by_profession array. In database terminology this is termed an outer-join. In Excel, this is called a VLookup. In Analytica, the expression to compute this is

Salary_by_profession[ Profession = Profession_by_person ]

The result is:

Person           Salary_by_person
'Joe Smith'      $75,000
'Mark Jones'     $32,000
'Greg Johnson'   $32,000
...
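
The same two-step lookup can be written in Python as a dictionary lookup per person (an analogy, not Analytica code). Only the rows listed above are used; the elided rows are left out.

```python
# Python analogy for Salary_by_profession[Profession = Profession_by_person]
# (not Analytica code). Only the rows shown in the tables above are included.
Salary_by_profession = {
    'Dock loader': 45000,
    'Crane operator': 75000,
    'Forklift driver': 32000,
}
Profession_by_person = {
    'Joe Smith': 'Crane operator',
    'Mark Jones': 'Forklift driver',
    'Greg Johnson': 'Forklift driver',
}

# Each person's profession is used as the subscript value into the salary table.
Salary_by_person = {
    person: Salary_by_profession[profession]
    for person, profession in Profession_by_person.items()
}
```

The result is indexed by Person rather than Profession, because Person is the dimension of the subscript value.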

Out of Range conditions

When A[I=x] or A[@I=n] is evaluated and x is not an element of I, or n is not a valid position in I (e.g., n<1 or n>size(I)), an out-of-range condition occurs. When the out-of-range value does not affect the final result, no warning is issued. An example of this is the following expression, which performs a "shift-left" operation:

 if @I+1>size(I) then 0 else A[@I=@I+1]

If the out-of-range condition does affect the final result, and the "Show Result Warnings" preference is on, then a warning is issued. If the warning is ignored, then Null is returned. In the following example, no warning appears to the user, even though the slice operation returns Null, because the out-of-range condition does not show up in the final result (this expression performs a "shift-right"):

 var v := A[@I=@I-1];
 if v=Null then 0 else v

A second way to allow an intentional out-of-range result is to surround the expression with the IgnoreWarnings function. Here is an alternative method for "shift-right", in which the first position will have the value Null:

 IgnoreWarnings( A[@I=@I-1] )

Wrapping the expression in the IgnoreWarnings function also evaluates more rapidly, because it relieves Analytica from having to track whether the out-of-range condition impacts the final result.
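
The shift expressions in this section can be modeled in Python as follows (not Analytica code; the helper slice_at is hypothetical and None stands in for Null). The model covers only the values produced, not the warning behavior.

```python
# Python model of the shift-left and shift-right expressions (not Analytica code).

def slice_at(values, n):
    """Model of A[@I = n]: 1-based position, None when out of range."""
    return values[n - 1] if 1 <= n <= len(values) else None

A = [10, 20, 30, 40]          # made-up sample values along index I
size = len(A)
positions = range(1, size + 1)

# if @I+1 > size(I) then 0 else A[@I=@I+1]
shift_left = [0 if pos + 1 > size else slice_at(A, pos + 1) for pos in positions]

# A[@I=@I-1], leaving Null (None) in the first position
shift_right_raw = [slice_at(A, pos - 1) for pos in positions]

# if v=Null then 0 else v
shift_right = [0 if v is None else v for v in shift_right_raw]
```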

Slice Assignment

(new to 4.0)

Analytica 4.0 allows you to assign a value to a single slice of a local variable, using either the slice or subscript operator (or the Slice or Subscript functions) on the left-hand side of the := operator.

For example, to change the value for one time point in an array, you could use either of these:

v[Time=5] := 0
v[@Time=5] := 0

When using slice assignment, the array MUST be a local variable, even if used from a button script.

Slice and subscript assignment support array abstraction. In the expressions

v[I=x]:= y 
v[@I=n]:= y

any of the parameters v, x, n, and y may have arbitrary dimensionality. Depending on these dimensionalities, slice assignment may increase the dimensionality of v, and may set many elements of v to the same value.

For example, suppose v is indexed only by I, but y contains the index J. Then after (v[I=x] := y) is evaluated, v will be indexed by both I and J. The slice corresponding to I=x will have a potentially different value for each element of J, but the other slices along I will be constant across J (having the original value of v).
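
This growth in dimensionality can be illustrated with a small Python model (not Analytica code; the sample values and the name v_after are made up). The array is modeled as a dictionary keyed by labels, and the assignment result is keyed by (I, J) pairs.

```python
# Python model of v[I=x] := y when y is indexed by J (not Analytica code).
I = ['a', 'b', 'c']
J = ['p', 'q']
v = {'a': 1, 'b': 2, 'c': 3}      # v indexed only by I
y = {'p': 10, 'q': 20}            # y indexed by J
x = 'b'                           # the slice being assigned

# After the assignment, v varies over both I and J: the slice I=x takes its
# values from y, and every other slice keeps its old value for all of J.
v_after = {(i, j): (y[j] if i == x else v[i]) for i in I for j in J}
```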

Subscripting and Meta-Inference

Some advanced Analytica code may perform Meta-Inference, i.e., inference about the objects and structure of the model itself, or when initiated from button scripts, algorithms that may even modify or alter the structure of the model. These types of algorithms typically impact only the most advanced Analytica users, but for authors of such algorithms, there are some additional considerations.

These considerations arise when an index contains object identifiers or expressions as elements. The default behavior of Subscript requires these variables and expressions to be evaluated, and for those that evaluate to scalars, both the scalar value and the original varTerm or expression are recognized by Subscript. This is demonstrated by the following example:

Variable A := 5
Variable B := 10
Index I := [ A, B ]           { Global index, defined as List, with identifier A and B in each cell }
Variable C := Table(I)(3,4)
C[I=5] → 3
C[I=A] → 3
C[I=VarTerm(A)] → 3
IndexValue(I) → [A,B]        { These are VarTerms -- handles to objects }

There is a distinct difference between the first two subscript examples and the third. C[I=5] and C[I=A] are performing associative lookup based on the evaluated result of the index elements, while C[I=VarTerm(A)] is using the raw un-evaluated IndexValue.

When your intention is to reason about the structure of the objects in the model, the default functionality of subscript has a couple of undesirable aspects. First, it forces the elements of the index to be evaluated. If those computations are expensive, they must be carried out before your meta-inference can proceed, and if there are errors or warnings while evaluating, those errors appear. Second, if some elements evaluate to varTerms, there may be ambiguities.

The following example demonstrates these concerns. The example collects the definitions (e.g., for a report) of all the objects in module Report_A that have definitions.

 Index allObjects := Contains of Report_A;
 var v := allObjects;
 Var allTitles := Title of v;
 Var allDefns  := Definition of v;
 Index Objects_With_Defs := Subset( not IsUndef(allDefns) );
 allDefns[allObjects = Objects_with_Defs]
 

When the subscript operation in the last line is evaluated, all the objects listed in the allObjects index will need to be fully evaluated. If a variable in module Report_A contains an error, the error will appear when the last line is evaluated. If a variable takes 5 hours to compute, this expression will trigger and wait for that computation, even though the result is not really needed here.

To avoid these problems, you must tell Analytica to treat the allObjects index as a meta-index, i.e., as an index containing literal expressions or identifiers, whose raw unevaluated values are to be used in associative lookup, but whose evaluated element values are not to be used. This is done by changing the first line to:

 MetaIndex allObjects := Contains of Report_A;
 ...

For a global index object, to achieve this treatment, you must set the MetaOnly attribute to 1 (true); see MetaOnly for instructions. In the earlier example with Index I := [A, B], when the MetaOnly attribute is set to true for Index I, the subscript expressions evaluate as

C[I=5] → Null                {not found}
C[I=A] → Null                {not found}
C[I=VarTerm(A)] → 3

With MetaOnly set, index elements containing system variables such as INF, NULL, True, False, Pi, etc., will not match to the underlying value, since these are actually system variable objects. See MetaOnly for more details.
