Difference between revisions of "Ensuring Array Abstraction"
Jhernandez3 (talk | contribs) m |
Revision as of 00:04, 7 January 2016
The vast majority of the elements of the Analytica language (operators, functions, and control constructs) fully support Intelligent Arrays — that is, they can handle operands or parameters that are arrays with any number of indexes, and generate a result with the appropriate dimensions. Thus, most models automatically obtain the benefits of array abstraction with no special care.
There are just a few elements that do not inherently enable Intelligent Arrays — i.e., support array abstraction. They fall into these main types:
- Functions whose parameters must be atoms (not arrays), including Sequence, m..n, and SplitText. See below.
- Functions whose parameter must be a vector (an array with just one index), such as CopyIndex, SortIndex, Subset, Unique, and Concat when called with two parameters.
- The While loop, which requires its termination condition to be an atom.
- If b Then c Else d, when condition b is an array, and c or d can give an evaluation error.
- Functions with an optional index parameter that is omitted, such as Sum(x), Product, Max, Min, Average, Argmax, SubIndex, ChanceDist, CumDist, and ProbDist.
When using these constructs, you must take special care to ensure that your model is fully array abstractable. Here we explain how to do this for each of these five types.
Functions Expecting Atomic Parameters
Consider this example:
Variable N := 1..3
Variable B := 1..N
B → Evaluation error:
One or both parameters to Sequence(m, n) or m..n are not scalars.
The expression 1..N, or equivalently, Sequence(1, N), cannot work if N is an array, because it would have to create a nonrectangular array containing slices with 1, 2, and 3 elements. Analytica does not allow nonrectangular arrays, and so requires the parameters of Sequence to be atoms (single elements).
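As a rough analogy outside Analytica, the following Python sketch shows why the slices cannot form a rectangular array. It is not Analytica code; the names `N`, `slices`, and `is_rectangular` are illustrative only.

```python
# Hypothetical stand-in for evaluating 1..N when N = [1, 2, 3].
N = [1, 2, 3]

# Build one sequence 1..n for each element n of N.
slices = [list(range(1, n + 1)) for n in N]

# The slices have different lengths, so they cannot share a single index.
lengths = [len(s) for s in slices]
is_rectangular = len(set(lengths)) == 1

print(slices)          # [[1], [1, 2], [1, 2, 3]]
print(is_rectangular)  # False
```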
Most functions and expressions that, like Sequence, are used to generate the definition of an index require atomic (or in some cases, vector) parameters, and so are not fully array abstractable. These include Sequence, Subset, SplitText, SortIndex (if the second parameter is omitted), Concat, CopyIndex, and Unique.
Why would you want array abstraction using such a function? Consider this approach to writing a function to compute a factorial:
Function Factorial2
Parameters: (n)
Definition: Product(1..n)
It works if n is an atom, but not if it is an array, because 1..n requires atom operands. In this version, however, using a For loop works fine:
Function Factorial3
Parameters: (n)
Definition: FOR m := n DO Product(1..m)
The For loop repeats with the loop variable m set to each atom of n, and evaluates the body Product(1..m) for each value. Because m is guaranteed to be an atom, this works fine. The For loop reassembles the result of each evaluation of Product(1..m) to create an array with all the same dimensions as n.
Atom parameters and array abstraction
Another way to ensure array abstraction in a function is to use the Atom qualifier for its parameter(s). When you qualify a parameter n as an Atom, you are saying that it must be a single value (not an array) when the function is evaluated, but not when the function is used:
Function Factorial3
Parameters: (n: Atom)
Definition: Product(1..n)
Index K := 1 .. 6
Factorial3(K) →
Notice that Atom does not require the actual parameter K to be an atom when the function is called. If K is an array, as in this case, it repeatedly evaluates the function Factorial3(n) with n set to each atom of array K. It then reassembles the results back into an array with the same indexes as parameter K, like the For loop above. This scheme works fine even if you qualify several parameters of the function as Atom.
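The evaluate-per-atom-then-reassemble behavior can be mimicked in Python as a loose analogy. This is only a sketch: `atomwise` is a hypothetical helper written for this illustration, nested lists stand in for arrays, and none of it reflects how Analytica implements the Atom qualifier internally.

```python
from functools import wraps
from math import prod


def atomwise(func):
    """Hypothetical helper: apply func to each atom of a nested list."""
    @wraps(func)
    def wrapper(value):
        if isinstance(value, list):
            # Recurse into each slice, then reassemble with the same shape.
            return [wrapper(item) for item in value]
        return func(value)  # value is a single atom here
    return wrapper


@atomwise
def factorial(n):
    # Like Product(1..n): only meaningful when n is a single number.
    return prod(range(1, n + 1))


K = [1, 2, 3, 4, 5, 6]
print(factorial(K))  # [1, 2, 6, 24, 120, 720]
```

The body of `factorial` never sees a list; the wrapper takes care of iterating and rebuilding the result, which is the division of labor the Atom qualifier provides.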
In some cases, a function might require a parameter to be a vector (have only one index), or have multiple dimensions with specified indexes. You can use Array qualifiers to specify this. With this approach, you can ensure your function array abstracts when new dimensions are added to your model, or if parameters are probabilistic.
While and array abstraction
The While b Do e construct requires its termination condition b to evaluate to an atom, that is, a single Boolean value, True (1) or False (0). Otherwise, it would be ambiguous whether to continue. Again, Atom is useful to ensure that a function using a While loop array abstracts, as it was for the Sequence function. Here’s a way to write a Factorial function using a While loop:
Function Factorial4
Parameters: (n: Atom)
Definition:
VAR fact := 1; VAR a := 1;
WHILE a < n DO (a := a + 1; fact := fact*a)
In this example, the Atom qualifier ensures that n, and hence the While termination condition a < n, is an atom during each evaluation of Factorial4.
If a Then b Else c and array abstraction
Consider this example:
Variable X := -2..2
Sqrt(X) → [NAN, NAN, 0, 1, 1.414]
The square root of negative numbers -2 and -1 returns NAN (not a number) after issuing a warning. Now consider the definition of Y:
Variable Y := (IF X > 0 THEN Sqrt(X) ELSE 0)
Y → [0, 0, 0, 1, 1.414]
When the condition a in IF a THEN b ELSE c is an array of truth values, as in this case, the construct evaluates both b and c. It then returns the corresponding elements of b or c, according to the value of condition a for each index value. Thus, it still ends up evaluating Sqrt(X) even for negative values of X. In this case, it returns 0 for those values, rather than NAN, and so it does not generate an error message.
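The evaluate-both-then-select behavior described above can be sketched in Python as an analogy. This is not Analytica code: `safe_sqrt` is a hypothetical helper that imitates returning NAN for negative inputs, and plain lists stand in for arrays.

```python
import math


def safe_sqrt(v):
    # Imitates the NAN result the text describes for negative inputs.
    return math.sqrt(v) if v >= 0 else math.nan


X = [-2, -1, 0, 1, 2]

# Both branches are computed for every element first...
then_branch = [safe_sqrt(v) for v in X]  # contains NaN for -2 and -1
else_branch = [0 for _ in X]

# ...and only afterwards is one of them selected per element.
Y = [t if v > 0 else e for v, t, e in zip(X, then_branch, else_branch)]

print([round(y, 3) for y in Y])  # [0, 0, 0, 1.0, 1.414]
```

The NaN values are computed but discarded by the selection step, which is harmless here. The same pattern becomes a problem when computing the unused branch raises an error instead of returning a placeholder value.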
A similar problem remains with text processing functions that require a parameter to be a text value. Consider this array:
Variable Z := [1000, '10,000', '100,000']
This kind of array containing true numbers, e.g., 1000, and numbers with commas turned into text values, often arises when copying arrays of numbers from spreadsheets. The following function would seem helpful to remove the commas and convert the text values into numbers:
Function RemoveCommas(t)
Parameters: (t)
Definition: Evaluate(TextReplace(t, ',', ''))
RemoveCommas(Z) →
Evaluation Error: The parameter of Pluginfunction TextReplace must be a text while evaluating function RemoveCommas.
TextReplace doesn’t like the first value of Z, which is a number, where it’s expecting a text value. What if we test whether t is text and only apply TextReplace when it is?
Function RemoveCommas(t)
Parameters: (t)
Definition: IF IsText(t)
THEN Evaluate(TextReplace(t, ',', '')) ELSE t
RemoveCommas(Z) → (same error message)
It still doesn’t work because the IF construct still applies TextReplace to all elements of t. Now, let’s add the parameter qualifier Atom to t:
Function RemoveCommas(t)
Parameters: (t: Atom)
Definition: IF IsText(t)
THEN Evaluate(TextReplace(t, ',', '')) ELSE t
RemoveCommas(Z) →
This works fine because the Atom qualifier means that RemoveCommas breaks its parameter t down into atomic elements before evaluating the function. During each evaluation of RemoveCommas, t, and hence IsText(t), is atomic, so the condition is either True or False. When it is False, the If construct evaluates the Else part but not the Then part, and so it calls TextReplace only when t is truly a text value. After evaluating the function separately for each element, it reassembles the results into the array shown above with the same index as Z.
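A Python analogy of the per-atom version is sketched below. It assumes the intended result is the numeric value of each element; `remove_commas` is an illustrative name, and `int(...)` stands in for Evaluate, which is an assumption made for this sketch.

```python
def remove_commas(t):
    # t is one atom: text is cleaned and converted, numbers pass through.
    if isinstance(t, str):
        return int(t.replace(",", ""))
    return t


Z = [1000, "10,000", "100,000"]

# Evaluate the function once per atom, then collect the results.
result = [remove_commas(t) for t in Z]
print(result)  # [1000, 10000, 100000]
```

Because the type test runs on a single element each time, the text-only operation is never applied to the numeric element.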
Omitted index parameters and array abstraction
Several functions have index parameters that are optional, including Sum, Product, Max, Min, Average, Argmax, SubIndex, ChanceDist, CumDist, and ProbDist. For example, with Sum(x, i), you can omit index i, and call it as Sum(x). But, if x has more than one index, it is hard to predict which index it sums over. Even if x has only one dimension now, you might add other dimensions later, for example for parametric analysis. This ambiguity makes the use of functions with omitted index parameters non-array abstractable.
There is a simple way to avoid this problem and maintain reliable array abstraction: When using functions with optional index parameters, never omit the index! Almost always, you know what you want to sum over, so mention it explicitly. If you add dimensions later, you’ll be glad you did.
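To see why the choice of index matters, here is a small Python analogy. The two-dimensional data and the index names Row and Col are made up for illustration. Summing over one index or the other gives results with different values and different shapes.

```python
# x is indexed by Row (outer list) and Col (position within each inner list).
x = [[1, 2, 3],
     [10, 20, 30]]

# Summing over Col leaves one total per Row.
sum_over_col = [sum(row) for row in x]

# Summing over Row leaves one total per Col.
sum_over_row = [sum(col) for col in zip(*x)]

print(sum_over_col)  # [6, 60]
print(sum_over_row)  # [11, 22, 33]
```

Naming the index explicitly removes any doubt about which of these two results is meant.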
Selecting indexes for iterating with For and Var
To provide detailed control over array abstraction, the For loop can specify exactly which indexes to use in the iterator x. The original form of For still works. It requires that the expression a assigned to iterator x generate an index; that is, it must be a defined index variable, Sequence(m, n), or m..n. The new forms of For are more flexible. They work for any array (or even atomic) value a. The loop iterates by assigning to x successive subarrays of a, dimensioned by the indexes listed in square brackets. If the square brackets are empty, as in the second line of the table, the successive values of iterator x are atoms. In the other cases, the indexes mentioned specify the dimensions of x to be used in each evaluation of e. In all cases, the final result of executing the For loop is a value with the same dimensions as a.
- For x := a DO e: Assigns to loop variable x successive atoms from index expression a and repeats evaluation of expression e for each value. Returns an array of values of e indexed by a.
- For x[] := a DO e: Assigns to loop variable x successive atomic values from array a. It repeats evaluation of expression e for each value. It returns an array of values of e with the same indexes as a.
- For x[i] := a DO e: Assigns to loop variable x successive subarrays from array a, each indexed only by i. It repeats evaluation of expression e for each index value of a other than i. As before, the result has the same indexes as a.
- For x[i, j …] := a DO e: Assigns to loop variable x successive subarrays from array a, each indexed only by i, j …. It repeats evaluation of expression e for each index value of a other than i, j …. As before, the result has the same indexes as a.
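The subarray form can be pictured with a Python analogy. In this sketch, which is not Analytica code, a nested list stands in for an array indexed by I (outer list) and J (inner positions), and `normalize` is a hypothetical body e that expects a subarray indexed only by J.

```python
# a is indexed by I (outer list) and J (position within each inner list).
a = [[1, 3],
     [2, 2],
     [4, 12]]


def normalize(x):
    # Body e: receives one subarray indexed only by J.
    total = sum(x)
    return [v / total for v in x]


# Analogue of iterating x[J] over a: one evaluation per value of I.
result = [normalize(row) for row in a]
print(result)  # [[0.25, 0.75], [0.5, 0.5], [0.25, 0.75]]
```

Each evaluation of the body sees a whole row, so it can sum across J, and the reassembled result has the same two dimensions as `a`.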
The same approach also works using Var to define local variables. By putting square brackets listing indexes after the new variable, you can specify the exact dimensions of the variable. These indexes should be a subset (none, one, some, or all) of the indexes of the assigned value a. Any subsequent expressions in the context are automatically repeated as each subarray is assigned to the local variable. In this way, a local variable can act as an implicit iterator, like the For loop.
Var Temp[i1, i2, ...] := X;