MdArrayToTable


MdArrayToTable(A,I,L)

Transforms a multi-dimensional array, A, into a two-dimensional array (i.e., a relational table) indexed by I and L. The result contains one row along I for each element of A. L must contain a list of names of the indexes of A, followed by one final element. All elements of L must be text values. The column corresponding to the final element of L contains the cell value. If L does not contain all the indexes of A, array abstraction will create a set of tables indexed by the dimensions not listed in L.

Creating Sparse Relational Tables

You can elect to include only the non-null / non-zero elements of «A» in the resulting table. Normally, when you want all elements, you provide an index «I» with a length equal to the total number of cells in the array. To include only the non-zero / non-null elements, provide an index with a length equal to the number of non-null / non-zero elements.

New to Analytica 4.2: Prior to 4.2, only zero elements were excluded. The ability to exclude Null elements is new. The «omitZero» and «omitNull» parameters are new to 4.2.

If you want to include zeroes, but not Nulls, then size «I» for the number of non-Null values and specify:

MdArrayToTable(...,omitZero:false)

Likewise, if you want to include Null cells as rows, but not zeros, specify:

MdArrayToTable(...,omitNull:false)

Creating a fact table

New to Analytica 4.2

MdArrayToTable(A,I,L,valueIndex)

A fact table is a generalization of a simple relational table. In a fact table, the first N columns identify cell coordinates, and the last K columns hold measured values (often called measures, facts, or just values). This generalized the simple relational table where only the single final row holds a value.

The optional parameter, «valueIndex», specifies an index of «A» that is used for the value columns in the result. The first Size(L) - Size(valueIndex) columns of «L» are then the coordinates, and the final Size(valueIndex) columns of «L» hold the values.

When omitting cells having no data, all values in «A» along the valueIndex must be zero or null. If any of the values have data, the corresponding cell (i.e., row in the result) is considered to have data.

Positional Coordinates

New to Analytica 4.2

MdArrayToTable(...,positional:true)

Normally, the first N columns of the result specify coordinates using index labels. When the optional parameter «positional» is specified as true, index positions are returned rather than labels.

Customizing Column Names

Normally, the column labels in «L» are just the index identifiers. However, you may need to use different labels, such as labels to match column names in an external database, or more human-readable names. To do this, define an index, say Column, with the desired labels. Then create a Table indexed by Column, call it L, and fill in the index names in the table. Pass L as the third parameter to MdArrayToTable.

Library

Array

Examples

Define these global objects (as indexes or variables):

Rows := sequence(1,size(Cost_in_time))
Cols := [‘Mpg’,’Time’,’Car_type’,’Cost’]
MDArrayToTable(Cost_in_time,Rows,Cols) →
Rows v, Cols >>
Mpg Time Car_type Cost
1 26 0 VW 2185
2 26 0 Honda 2385
3 26 0 BMW 3185
4 26 1 VW 2294
5 26 1 Honda 2314
6 26 1 BMW 3294
7 26 2 VW 2409
...
45 35 4 BMW 5175

Note: The expression for doing the same transformation using local indexes looks like

Index Rows := Sequence(1,Size(Cost_in_time))
Index Cols := [‘Mpg’,’Time’,’Car_type’,’Cost’]
MdArrayToTable(Cost_in_time,Rows,Cols)

Positional example

Changing the previous example to:

MdArrayToTable(Cost_in_time,Rows,Cols,positional:true)

causes the coordinates in the first 3 columns to be expressed as index positions, rather than index labels.

Rows v, Cols >>
Mpg Time Car_type Cost
1 1 1 1 2185
2 1 1 2 2385
3 1 1 3 3185
4 1 2 1 2294
5 1 2 2 2314
6 1 2 3 3294
7 1 3 1 2409
...
45 3 5 3 5175

Creating a Fact Table

In this example, well provide a separate column for each car-type, rather than including that as a relational dimension. The numbers in the columns are the cost.

Index Row := 1..Sum(1,Mpg,Time)
Index Label := [Mpg,Time,'VW','Honda','BMW'];
Variable Cost := MdArrayToTable(Cost_in_time,Row,Label,valueIndex:Car_type)


Cost →
Rows v, Cols >>
Mpg Time VW Honda BMW
1 26 0 2185 2385 3185
2 26 1 2294 2314 3294
3 26 2 2409 2512 3359
...
15 35 4 3829 4230 5175

Tips and Tricks

Textual Index Names vs. Handles

You can specify indexes for the first N columns of «L» using textual index names, or using handles. Which is better?

Personally, I like to use handles, since this allows Analytica to automatically re-write the definition if I rename one of the indexes later. If I simply enter a literal textual name, and later change the index identifier, the literal text remains the same.

When defining «L» using handles, you should define an index as a List (not a list-of-labels), and then enter the index identifiers into the first N cells. For the final cell, you can enter anything, but I often just enter the identifier for the array being transformed. When using a list-of-identifiers like this, it is critical that you set the metaOnly attribute to 1. In the attribute panel, click the attribute pulldown and select metaOnly and enter 1 in the panel. This prevents the identifiers from being evaluated, ensuring that it is the identifiers themselves (handles actually) that comprise the mid value when the index gets passed in as «L».

Preserving Array-abstraction

Most Analytica functions are array-abstractable, meaning that they adhere to the law of array-abstraction, a very powerful principle that, when composed properly, results in models that are array-abstractable. The law of array abstraction states that for any unit of computation, f(x,y,..), that computes a result based on inputs x,y,..., and which does not operate over index I, it holds that:

f(x[I=v],y[I=v],..) = f(x,y,...)[I=v]

The law ensures that we can add a dimension to the inputs of computational unit, while preserving a consistency with the results before the dimension was present: For any input slice having the original values, the corresponding output slice will compute the original value.

This principle becomes particular powerful from the fact that the composition of array-abstractable expressions is array-abstractable. Thus, as long as you emply only array-abstractable constructs, your entire will be fully array-abstractable. This enable you to rely on power analyses, including parametric and uncertainty analyses.

MdArrayToTable is an example of an array-abstractable function when you use it with a fixed number of indexes in «L». For example, you might list two indexes to be transformed (so that the index of «L» has length three, the last column to hold the value). If we later supply a three dimensional array, two of the dimensions are transformed to a relational table, the extra third dimension is array-abstracted, resulting in an array of relational tables. This transformation takes a while to get your head around -- but it is an exact application of the law of array-abstraction above.

A common practice is to use MdArrayToTable to transform all dimensions of an array into the relational table. This is done by surrounding MdArrayToTable with some general purpose code that iterates over all existing dimensions in some way. This application of MdArrayToTable creates a violation of the law of array-abstraction. The violation implies that you can no longer assume that any model that does this internally is consistent with the law of array-abstraction. This generalized transformation of arbitrary dimensions does require some special meta-constructs around MdArrayToTable. As long as you understand the implications of doing this, there are certainly legitimate uses of a full transformation, especially in points that interface with databases or programs external to Analytica.

Transforming all dimesions

There are applications where it is useful to transform all dimensions of an array, in such a way that if new dimensions are added in the future, the resulting relational table will adapt to have more columns to accomodate those new dimensions. The previous sub-section above explains how doing this violates the principle of array-abstraction. However, there are some good reasons for wanting to do this. Here I show how this is done.

To transform all dimensions of an array, the index of «L» must have an element for each dimension, plus one additional element for the array's value. Hence, «L» would be defined as:

Index L := Concat(IndexesOf(A),'Value')

Next, you must define the row index to have either the total number of elements, or the number of non-null (or non-zero) elements. For the dense transformation it is easy:

Index I := 1..Size(A)

For the sparse transformation, you must count the number of non-null or non-zero elements. This is done as:

Index I := 1..SumAll(A<>Null and A<>0)

where SumAll is a user-defined function as follows:

Function SumAll(x) := 
   LocalAlias I:=IndexesOf Do x := Sum(x,I);
   x

With these defined, the call is the usual:

MdArrayToTable(A,I,L)

See Also

Comments


You are not allowed to post comments.