MdArrayToTable
MdArrayToTable(A,I,L,valueIndex,positional,omitZero,omitNull)
Transforms a multi-dimensional array, «A», into a two-dimensional relational table, or fact table, indexed by «I» and «L».
A relational table contains one row along «I» for each element of «A». The first n columns along «L» identify the coordinate of the cell, while the last column holds the cell value. «L» has n+1 elements. A relational table is returned when the «valueIndex» parameter is omitted.
A fact table flattens «A» ovor all indexes except one, the «valueIndex». Each slice of «A» along the «valueIndex» appears as a column in the result. The first n columns of the result identify the coordinate in «A» and the last k columns correspond to the «valueIndex». «L» has n+k elements. There is one row for each combination of the index values of «A» excluding the «valueIndex».
As of Analytica 4.5, all parameters after «A» are optional. Hence, the easiest usage is just MdArrayToTable(A)
, which returns a relational table of all non-empty cells in the array. In Analytica 4.4 and earlier, you must provide the «I» and «L» indexes.
When you specify «L», each of the first n elements of «L» identifies an index of «A», and to do that, it must be either a handle to the index or the textual identifier of the index. The last element of «L» in a relational table, or the last k elements in a fact table, corresponds to the value column(s), and you may use any label you wish. When you define the index you will use for «L», the Concat function is often used, e.g., Concat(IndexesOf(A),'A')
or for a fact table, Concat(#SetDifference(\IndexesOf(A),\Handle(valueIndex)),valueIndex)
. Notice that for a fact table, «L» should not include the «valueIndex».
You usually create an index, which you pass to the «L» parameter, but you also have the option of passing a one-dimensional array for «L». This can be useful if you want the index values to be something other than the handles or identifiers of the indexes of «A». When «L» in a 1-D array, it is the cells of the array (rather than the index elements) that must contain the index handle or identifier.
You are not required to include all indexes of «A» in «L». In this case, you obtain a partial flattening, where «A» is flattened only over the indexes that appear in «L». All other indexes are array abstracted, and appear as indexes of the result.
(new to [[Analytica 4.5]) When you omit «L», a local index named L
is created for you, which contains all the indexes of «A» other than «valueIndex» as handles. For a relational table, the last column is named 'Value'
, and for a fact table, the last k columns are the elements of «valueIndex».
Creating Sparse Relational Tables
You can elect to include only the non-null / non-zero elements of «A» in the resulting table. Normally, when you want all elements, you provide an index «I» with a length equal to the total number of cells in the array. To include only the non-zero / non-null elements, provide an index with a length equal to the number of non-null / non-zero elements. If you omit «I» (Analytica 4.5 or higher), only the non-empty elements are included, although the «omitZero» and «omitNull» flags alter this.
New to Analytica 4.2: Prior to 4.2, only zero elements were excluded. The ability to exclude Null elements is new. The «omitZero» and «omitNull» parameters are new to 4.2.
If you want to include zeroes, but not Nulls, then size «I» for the number of non-Null values and specify:
- MdArrayToTable(...,omitZero:false)
Likewise, if you want to include Null cells as rows, but not zeros, specify:
- MdArrayToTable(...,omitNull:false)
Creating a fact table
New to Analytica 4.2
- MdArrayToTable(A,I,L,valueIndex)
A fact table is a generalization of a simple relational table. In a fact table, the first N columns identify cell coordinates, and the last K columns hold measured values (often called measures, facts, or just values). This generalized the simple relational table where only the single final row holds a value.
The optional parameter, «valueIndex», specifies an index of «A» that is used for the value columns in the result. The first Size(L) - Size(valueIndex) columns of «L» are then the coordinates, and the final Size(valueIndex) columns of «L» hold the values.
When omitting cells having no data, all values in «A» along the valueIndex must be zero or null. If any of the values have data, the corresponding cell (i.e., row in the result) is considered to have data.
Positional Coordinates
New to Analytica 4.2
- MdArrayToTable(...,positional:true)
Normally, the first N columns of the result specify coordinates using index labels. When the optional parameter «positional» is specified as true, index positions are returned rather than labels.
Customizing Column Names
Normally, the column labels in «L» are just the index identifiers. However, you may need to use different labels, such as labels to match column names in an external database, or more human-readable names. To do this, define an index, say Column, with the desired labels. Then create a Table indexed by Column, call it L, and fill in the index names in the table. Pass L as the third parameter to MdArrayToTable.
Library
Array
Examples
Define these global objects (as indexes or variables):
Rows := sequence(1,size(Cost_in_time)) Cols := [‘Mpg’,’Time’,’Car_type’,’Cost’]
MDArrayToTable(Cost_in_time,Rows,Cols) →
Mpg | Time | Car_type | Cost | |
---|---|---|---|---|
1 | 26 | 0 | VW | 2185 |
2 | 26 | 0 | Honda | 2385 |
3 | 26 | 0 | BMW | 3185 |
4 | 26 | 1 | VW | 2294 |
5 | 26 | 1 | Honda | 2314 |
6 | 26 | 1 | BMW | 3294 |
7 | 26 | 2 | VW | 2409 |
... | ||||
45 | 35 | 4 | BMW | 5175 |
Note: The expression for doing the same transformation using local indexes looks like
Index Rows := Sequence(1,Size(Cost_in_time)) Index Cols := [‘Mpg’,’Time’,’Car_type’,’Cost’] MdArrayToTable(Cost_in_time,Rows,Cols)
Positional example
Changing the previous example to:
MdArrayToTable(Cost_in_time,Rows,Cols,positional:true)
causes the coordinates in the first 3 columns to be expressed as index positions, rather than index labels.
Mpg | Time | Car_type | Cost | |
---|---|---|---|---|
1 | 1 | 1 | 1 | 2185 |
2 | 1 | 1 | 2 | 2385 |
3 | 1 | 1 | 3 | 3185 |
4 | 1 | 2 | 1 | 2294 |
5 | 1 | 2 | 2 | 2314 |
6 | 1 | 2 | 3 | 3294 |
7 | 1 | 3 | 1 | 2409 |
... | ||||
45 | 3 | 5 | 3 | 5175 |
Creating a Fact Table
In this example, well provide a separate column for each car-type, rather than including that as a relational dimension. The numbers in the columns are the cost.
Index Row := 1..Sum(1,Mpg,Time) Index Label := [Mpg,Time,'VW','Honda','BMW']; Variable Cost := MdArrayToTable(Cost_in_time,Row,Label,valueIndex:Car_type)
Cost → |
|
Tips and Tricks
Textual Index Names vs. Handles
You can specify indexes for the first N columns of «L» using text index names, or using handles. Which is better? The advantage of using handles is that, if you rename one of the indexes, it will automatically update the name in the definition, as it does in other definitions referring to that identifier. If you use text names, it will cause an "Unidentified identifier" error unless or until you edit the name in «L».
When defining «L» using handles, you should define an index as a List (not a list-of-labels), and then enter the index identifiers into the first N cells. For the final cell, you can enter anything, but I often just enter the identifier for the array being transformed. When using a list-of-identifiers like this, it is essential to set the MetaOnly attribute to 1. With the Index node selected in the Diagram,
- open the Attribute panel,
- click the Attribute pulldown,
- select metaOnly and
- enter 1 into the panel.
This prevents it from evaluating the variables in the list, and makes sure that the identifiers (handles actually)rather than their values that get passed as the value of «L».
Preserving Array-abstraction
Most Analytica functions are array-abstractable, meaning that they adhere to the law of array-abstraction, a very powerful principle that, when composed properly, results in models that are array-abstractable. The law of array abstraction states that for any unit of computation, f(x,y,..)
, that computes a result based on inputs x,y,..., and which does not operate over index I, it holds that:
f(x[I=v],y[I=v],..) = f(x,y,...)[I=v]
The law ensures that we can add a dimension to the inputs of computational unit, while preserving a consistency with the results before the dimension was present: For any input slice having the original values, the corresponding output slice will compute the original value.
This principle becomes particular powerful from the fact that the composition of array-abstractable expressions is array-abstractable. Thus, as long as you emply only array-abstractable constructs, your entire will be fully array-abstractable. This enable you to rely on power analyses, including parametric and uncertainty analyses.
MdArrayToTable is an example of an array-abstractable function when you use it with a fixed number of indexes in «L». For example, you might list two indexes to be transformed (so that the index of «L» has length three, the last column to hold the value). If we later supply a three dimensional array, two of the dimensions are transformed to a relational table, the extra third dimension is array-abstracted, resulting in an array of relational tables. This transformation takes a while to get your head around -- but it is an exact application of the law of array-abstraction above.
A common practice is to use MdArrayToTable to transform all dimensions of an array into the relational table. This is done by surrounding MdArrayToTable with some general purpose code that iterates over all existing dimensions in some way. This application of MdArrayToTable creates a violation of the law of array-abstraction. The violation implies that you can no longer assume that any model that does this internally is consistent with the law of array-abstraction. This generalized transformation of arbitrary dimensions does require some special meta-constructs around MdArrayToTable. As long as you understand the implications of doing this, there are certainly legitimate uses of a full transformation, especially in points that interface with databases or programs external to Analytica.
Transforming all dimensions
It's often useful to convert all dimensions of an array into a 2D table -- even if it does mean the code is not fully array abstractable. In this case, any dimension that you add (remove) to «A» in the future will add (remove) a column from the 2D result, whereas in a purely array-abstractable function the introduction of a new dimension to «A» would become a new dimension of the result.
In Analytica 4.5 or higher, a full flattening is easily obtained using MdArrayToTable(A)
. By omitting «I» and «L», local indexes are created for you.
If you are using Analytica 4.4 or earlier, you need to create and define «I» and «L» yourself. The index «L» must have an element for each dimension of x, plus one extra element for the array's value:
Index Cols := Concat(IndexesOf(A),'Value')
The row index to have either the total number of elements in A, as in:
Index Rows := 1..Size(A)
Or if you only want to exclude rows that contain NULL or zero, use:
Index Rows := 1..SumAll(A<>Null and A<>0)
where SumAll is a user-defined function that sums x over all its indexes:
Function SumAll(x) := LocalAlias I:=IndexesOf(x) Do x := Sum(x,I); x
Now you can call the function as usual:
MdArrayToTable(A,Rows,Cols)
Enable comment auto-refresher