MdArrayToTable
MdArrayToTable(a, row, col, valueIndex, positional, omitZero, omitNull)
Transforms a multi-dimensional array, «a», into a two-dimensional relational table or a fact table. A relational table contains one row for each element of «a» that is not empty -- not containing zero or NULL (default).
All parameters except the first are optional. If you want to transform all dimensions of the array, it is easiest to omit the «row» and «col» parameters. When these are omitted, local indexes named .Row and .Col are automatically created for you with the correct number of elements.
Use the optional «row» and «col» parameters to provide the indexes for the final result explicitly. When these are omitted, the function does not abide by the law of array-abstraction (see Preserving Array-abstraction).
The rows of the result are indexed by «row» if specified, or a local index named .Row
if «row» is omitted. In a relational table, the first n columns along «col» identify the coordinate of the cell, while the last column holds the cell value. «col» has n+1 elements. A relational table is returned when the «valueIndex» parameter is omitted.
A fact table flattens «a» over all indexes except one, the «valueIndex». Each slice of «a» along the «valueIndex» appears as a column in the result. The first n columns of the result identify the coordinate in «a» and the last k columns correspond to the «valueIndex». «col» has n+k elements. There is one row for each combination of the index values of «a» excluding the «valueIndex». A fact table is still two-dimensional, but it is more compact, reducting the number of rows by a factor equal to the size of «valueIndex».
All parameters after «a» are optional. Hence, the easiest usage is just MdArrayToTable(a)
, which returns a relational table of all non-empty cells in the array.
When you specify «col», each of the first n elements of «col» identifies an index of «a», and to do that, it must be either a handle to the index or the textual identifier of the index. The last element of «col» in a relational table, or the last k elements in a fact table, corresponds to the value column(s), and you may use any label you wish. When you define the index you will use for «col», the Concat function is often used, e.g., Concat(IndexesOf(A),'A')
or for a fact table, Concat(#SetDifference(\IndexesOf(A),\Handle(valueIndex)),valueIndex)
. Notice that for a fact table, «col» should not include the «valueIndex».
You usually create an index, which you pass to the «col» parameter, but you also have the option of passing a one-dimensional array for «col». This can be useful if you want the index values to be something other than the handles or identifiers of the indexes of «a». When «col» in a 1-D array, it is the cells of the array (rather than the index elements) that must contain the index handle or identifier.
You are not required to include all indexes of «a» in «col». In this case, you obtain a partial flattening, where «a» is flattened only over the indexes that appear in «col». All other indexes are array abstracted, and appear as indexes of the result.
When you omit «col», a local index named .Col
is created for you, which contains all the indexes of «a» other than «valueIndex» as handles. For a relational table, the last column is named 'Value'
, and for a fact table, the last k columns are the elements of «valueIndex».
Creating Sparse Relational Tables
You can elect to include only the non-null / non-zero elements of «a» in the resulting table. Normally, when you want all elements, you provide an index «row» with a length equal to the total number of cells in the array. To include only the non-zero / non-null elements, provide an index with a length equal to the number of non-null / non-zero elements. If you omit «row», only the non-empty elements are included, although the «omitZero» and «omitNull» flags alter this.
If you want to include zeroes, but not Nulls, then size «row» for the number of non-Null values and specify:
- MdArrayToTable(...,omitZero:false)
Likewise, if you want to include Null cells as rows, but not zeros, specify:
- MdArrayToTable(...,omitNull:false)
Creating a fact table
- MdArrayToTable(A,Row,Col,valueIndex)
A fact table is a generalization of a simple relational table. In a fact table, the first N columns identify cell coordinates, and the last K columns hold measured values (often called measures, facts, or just values). This generalized the simple relational table where only the single final row holds a value.
The optional parameter, «valueIndex», specifies an index of «a» that is used for the value columns in the result. The first Size(Col) - Size(valueIndex) columns of «col» are then the coordinates, and the final Size(valueIndex) columns of «col» hold the values.
When omitting cells having no data, all values in «a» along the «valueIndex» must be zero or null. If any of the values have data, the corresponding cell (i.e., row in the result) is considered to have data.
Positional Coordinates
- MdArrayToTable(..., positional: true)
Normally, the first N columns of the result specify coordinates using index labels. When the optional parameter «positional» is specified as true, index positions are returned rather than labels.
Customizing Column Names
Normally, the column labels in «col» are just the index identifiers. However, you may need to use different labels, such as labels to match column names in an external database, or more human-readable names. To do this, define an index, say Column, with the desired labels. Then create a Table indexed by Column, call it L, and fill in the index names in the table. Pass L as the third parameter to MdArrayToTable.
Library
Array
Examples
Define these global objects (as indexes or variables):
Rows := sequence(1,size(Cost_in_time)) Cols := ["Mpg","Time","Car_type","Cost"]
MDArrayToTable(Cost_in_time,Rows,Cols) →
Mpg | Time | Car_type | Cost | |
---|---|---|---|---|
1 | 26 | 0 | VW | 2185 |
2 | 26 | 0 | Honda | 2385 |
3 | 26 | 0 | BMW | 3185 |
4 | 26 | 1 | VW | 2294 |
5 | 26 | 1 | Honda | 2314 |
6 | 26 | 1 | BMW | 3294 |
7 | 26 | 2 | VW | 2409 |
... | ||||
45 | 35 | 4 | BMW | 5175 |
Note: The expression for doing the same transformation using local indexes looks like
Index Rows := Sequence(1,Size(Cost_in_time)); Index Cols := ["Mpg","Time","Car_type","Cost"]; MdArrayToTable(Cost_in_time,Rows,Cols)
Positional example
Changing the previous example to:
MdArrayToTable(Cost_in_time,Rows,Cols,positional:true)
causes the coordinates in the first 3 columns to be expressed as index positions, rather than index labels.
Mpg | Time | Car_type | Cost | |
---|---|---|---|---|
1 | 1 | 1 | 1 | 2185 |
2 | 1 | 1 | 2 | 2385 |
3 | 1 | 1 | 3 | 3185 |
4 | 1 | 2 | 1 | 2294 |
5 | 1 | 2 | 2 | 2314 |
6 | 1 | 2 | 3 | 3294 |
7 | 1 | 3 | 1 | 2409 |
... | ||||
45 | 3 | 5 | 3 | 5175 |
Creating a Fact Table
In this example, well provide a separate column for each car-type, rather than including that as a relational dimension. The numbers in the columns are the cost.
Index Row := 1..Sum(1, Mpg, Time) Index Label := [Mpg,Time,'VW','Honda','BMW']; Variable Cost := MdArrayToTable(Cost_in_time, Row, Label, valueIndex: Car_type)
Cost → |
|
Tips and Tricks
Textual Index Names vs. Handles
You can specify indexes for the first N columns of «col» using text index names, or using handles. Which is better? The advantage of using handles is that, if you rename one of the indexes, it will automatically update the name in the definition, as it does in other definitions referring to that identifier. If you use text names, it will cause an "Unidentified identifier" error unless or until you edit the name in «col».
When defining «col» using handles, you should define an index as a List (not a list-of-labels), and then enter the index identifiers into the first N cells. For the final cell, you can enter anything, but I often just enter the identifier for the array being transformed. When using a list-of-identifiers like this, it is essential to set the MetaOnly attribute to 1. With the Index node selected in the Diagram,
- open the Attribute panel,
- click the Attribute pulldown,
- select metaOnly and
- enter 1 into the panel.
This prevents it from evaluating the variables in the list, and makes sure that the identifiers (handles actually)rather than their values that get passed as the value of «col».
Preserving Array-abstraction
Most Analytica functions are array-abstractable, meaning that they adhere to the law of array-abstraction, a very powerful principle that, when composed properly, results in models that are array-abstractable. The law of array abstraction states that for any unit of computation, f(x,y,..)
, that computes a result based on inputs x,y,..., and which does not operate over index I, it holds that:
f(x[I=v],y[I=v],..) = f(x,y,...)[I=v]
The law ensures that we can add a dimension to the inputs of computational unit, while preserving a consistency with the results before the dimension was present: For any input slice having the original values, the corresponding output slice will compute the original value.
This principle becomes particular powerful from the fact that the composition of array-abstractable expressions is array-abstractable. Thus, as long as you emply only array-abstractable constructs, your entire will be fully array-abstractable. This enable you to rely on power analyses, including parametric, sensitivity, uncertainty, and scenario analyses.
MdArrayToTable is an example of an array-abstractable function when you specify «row» and «col» parameters explicitly with a fixed number of indexes in «col». For example, you might list two indexes to be transformed (so that the index of «col» has length three, the last column to hold the value). If we later supply a three dimensional array, two of the dimensions are transformed to a relational table, the extra third dimension is array-abstracted, resulting in an array of relational tables. This transformation takes a while to get your head around -- but it is an exact application of the law of array-abstraction above.
A common practice is to use MdArrayToTable to transform all dimensions of an array into the relational table. This is done by omitting the «row» and «col» parameters, or by creating general purpose code that collects all existing dimensions. This application of MdArrayToTable creates a violation of the law of array-abstraction. The violation implies that you can no longer assume that any model that does this internally is consistent with the law of array-abstraction. This generalized transformation of arbitrary dimensions does require some special meta-constructs around MdArrayToTable. As long as you understand the implications of doing this, there are certainly legitimate uses of a full transformation, especially in points that interface with databases, spreadsheets, or programs external to Analytica. When it is used as part of an algorithm within your model, you should use it in an array-abstractable manner otherwise parametric, sensitivity and scenario analyses over your model may produce incorrect results.
Transforming all dimensions
It's often useful to convert all dimensions of an array into a 2D table -- even if it does mean the code is not fully array abstractable. In this case, any dimension that you add (remove) to «a» in the future will add (remove) a column from the 2-D result, whereas in a purely array-abstractable function the introduction of a new dimension to «a» would become a new dimension of the result.
You can easily full flatten an array fully by using MdArrayToTable(A)
-- it adds local indexes .Row and .Col, if you omit parameters «row» and «col».
Otherwise, you need to create and define «row» and «col» yourself. The index «col» must have an element for each dimension of x, plus one extra element for the array's value:
Index Col := Concat(IndexesOf(A),'Value')
The row index to have either the total number of elements in A, as in:
Index Row := 1..Size(A)
Or if you only want to exclude rows that contain NULL or zero, use:
Index Row := 1..SumAll(A<>Null and A<>0)
where SumAll is a user-defined function that sums x over all its indexes:
Function SumAll(x) := LocalAlias I:=IndexesOf(x) Do x := Sum(x,I); x
Now you can call the function as usual:
MdArrayToTable(A,Row,Col)
See Also
History
The optional parameters «Row» and «Col» were required before Analytica 4.5 and were previously named «I» and «L». The optional parameters «omitZero» and «omitNull» were introduced in Analytica 4.2. The Fact table and parameter «valueIndex» and optional parameter «positional» were also introduced in Analytica 4.2.
Enable comment auto-refresher