MdArrayToTable


MdArrayToTable(a, row, col, valueIndex, positional, omitZero, omitNull)

Transforms a multi-dimensional array, «a», into a two-dimensional relational table. A relational table contains a row corresponding to each cell of «a» and a column for each index of «a». Each row shows the values of each index identifying that cell, plus a final column 'Value' with the cell value. By default, it generates local indexes .Row and .Col, unless you specify global indexes as «row» or «col» parameters.

By default, it omits rows for any cells of «a» that are empty, i.e. contain 0 or Null. Hence, a relational table is a more compact representation for sparse arrays in which a large fraction of cells are empty. On the other hand, the standard Analytica array representation is usually more compact and easier to handle for arrays with few or no empty cells.

All parameters except the first are optional.

Use the optional «row» and «col» parameters to provide the indexes for the final result explicitly. When you omit these, the function does not abide by the law of array-abstraction (see Preserving Array-abstraction).

The «col» parameter

The «col» parameter is an index specifying the columns of the result. If you omit «col», it automatically creates a local index, called .Col, which includes all the indexes of the array, plus 'Value' that contains the measure of each cell of the array. When omitted, it is similar to you having specified the parameter as a index defined as Index Col := Concat(IndexesOf(A), 'Value').

Reasons to specify «col» are when you want to control the sequence of the columns that contain the first indexes (other than the «valueIndex») or when you want to use only a subset of the indexes. When you omit one or more indexes, it flattens array «a» only over the indexes that appear in «col». The other indexes are array abstracted, and are retained as indexes in the result.

When you specify «col», each of the first n elements of «col» identifies an index of «a», as a handle to the index or as the textual identifier of the index. It's safer to use handles in case the identifier of an index gets changed. If you want to have text values rather than handles or identifiers in the «col» index of the result, you can provide «col» as a 1-D array, whose index contains the desired text values (with 'Value' or any text value in the last item), and whose cells contain the index handles (or identifiers).

The «row» parameter

The «row» parameter is the index for rows in the result. If you omit the parameter, the result has a local index .Row, with an element for each combination of the index values of «a».

Sparse Tables with «omitZero» or «omitNull»

The result normally excludes any rows whose original cell values are 0 or NULL , unless you specify «omitZero» or «omitNull» as False, respectively. It will also include all the rows, including those with 0 or Null values, if you specify «row» as an index whose length is equal to Size(«a»).

If you want to include zeroes, but not Nulls, then size «row» for the number of non-Null values and specify:

MdArrayToTable(..., omitZero: false)

Or, if you want to include Null cells as rows, but not zeros, specify:

MdArrayToTable(..., omitNull: false)

Fact tables and «valueIndex»

MdArrayToTable() can also produce a fact table -- an extension of a relational table, in which each row, shows the the values of all the cells over one index of «a» instead of a single 'Value' cell. You specify the index to use as the «valueIndex» parameter. A fact table is 2D, like a relational table, but it is more compact, reducing the number of rows by a factor equal to the size of «valueIndex». The first Size(Col) - Size(«valueIndex») columns of «col» are then the coordinates, and the final Size(«valueIndex») columns of «col» hold the values.

If you specify the «col» parameter, it must contain all (or some) of the indexes of «a» excluding the «valueIndex», concatenated with the values of the «valueIndex». If «a» has n indexes, including the «valueIndex», «col» contains up to n-1 + k elements, where k is the number of elements in «valueIndex». You could generate this index Concat(#SetDifference(\IndexesOf(A), \Handle(valueIndex)), valueIndex). If you omit the «col» parameter, it does this automatically.

A fact table normally omits rows whose values along the «valueIndex» are all zero or Null, unless you specify «omitZero» or «omitNull» as False.

Positional Coordinates

MdArrayToTable(..., positional: true)

Normally, the first N columns of the result contain coordinates using index labels. When you specify parameter «positional» as true, it returns the positions of each index value rather than its label.

Customizing Column Names

Normally, the column labels in «col» are just the index identifiers. However, you may need to use different labels, such as labels to match column names in an external database, or more human-readable names. To do this, define an index, say Column, with the desired labels. Then create a Table indexed by Column, call it L, and fill in the index names in the table. Pass L as the third parameter to MdArrayToTable

Examples

Define these global objects (as indexes or variables):

Rows := sequence(1, size(Cost_in_time))
Cols := ["Mpg", "Time", "Car_type", "Cost"]
MDArrayToTable(Cost_in_time, Rows, Cols) →
Cols ▶
Rows ▼ Mpg Time Car_type Cost
1 26 0 VW 2185
2 26 0 Honda 2385
3 26 0 BMW 3185
4 26 1 VW 2294
5 26 1 Honda 2314
6 26 1 BMW 3294
7 26 2 VW 2409
...
45 35 4 BMW 5175

Note: The expression for doing the same transformation using local indexes looks like

Index Rows := Sequence(1, Size(Cost_in_time));
Index Cols := ["Mpg", "Time", "Car_type", "Cost"];
MdArrayToTable(Cost_in_time, Rows, Cols)

Positional example

Changing the previous example to:

MdArrayToTable(Cost_in_time, Rows, Cols, positional: true)

causes the coordinates in the first 3 columns to be expressed as index positions, rather than index labels.

Cols ▶
Rows ▼ Mpg Time Car_type Cost
1 1 1 1 2185
2 1 1 2 2385
3 1 1 3 3185
4 1 2 1 2294
5 1 2 2 2314
6 1 2 3 3294
7 1 3 1 2409
...
45 3 5 3 5175

Creating a Fact Table

In this example, we will provide a separate column for each car-type, rather than including that as a relational dimension. The numbers in the columns are the cost.

Index Row := 1..Sum(1, Mpg, Time)
Index Label := [Mpg, Time, 'VW', 'Honda', 'BMW'];
Variable Cost := MdArrayToTable(Cost_in_time, Row, Label, valueIndex: Car_type)
Cost →
Cols ▶
Rows ▼ Mpg Time VW Honda BMW
1 26 0 2185 2385 3185
2 26 1 2294 2314 3294
3 26 2 2409 2512 3359
...
15 35 4 3829 4230 5175

Details

Textual Index Names vs. Handles

You can specify indexes for the first N columns of «col» using text index names, or using handles. Which is better? The advantage of using handles is that, if you rename one of the indexes, it will automatically update the name in the definition, as it does in other definitions referring to that identifier. If you use text names, it will cause an "Unidentified identifier" error unless or until you edit the name in «col».

When defining «col» using handles, you should define an index as a List (not a list-of-labels), and then enter the index identifiers into the first N cells. For the final cell, you can enter anything, but I often just enter the identifier for the array being transformed. When using a list-of-identifiers like this, it is essential to set the MetaOnly attribute to 1. With the Index node selected in the Diagram,

  1. open the Attribute panel,
  2. click the Attribute pulldown,
  3. select metaOnly and
  4. enter 1 into the panel.

This prevents it from evaluating the variables in the list, and makes sure that the identifiers (handles actually)rather than their values that get passed as the value of «col».

Preserving Array-abstraction

Most Analytica functions are array-abstractable, meaning that they adhere to the law of array-abstraction, a very powerful principle that, when composed properly, results in models that are array-abstractable. The law of array abstraction states that for any unit of computation, f(x, y,..), that computes a result based on inputs x, y,..., and which does not operate over index I, it holds that:

f(x[I = v], y[I = v], ..) = f(x, y, ...)[I = v]

The law ensures that we can add a dimension to the inputs of computational unit, while preserving a consistency with the results before the dimension was present: For any input slice having the original values, the corresponding output slice will compute the original value.

This principle becomes particular powerful from the fact that the composition of array-abstractable expressions is array-abstractable. Thus, as long as you emply only array-abstractable constructs, your entire will be fully array-abstractable. This enable you to rely on power analyses, including parametric, sensitivity, uncertainty, and scenario analyses.

MdArrayToTable is an example of an array-abstractable function when you specify «row» and «col» parameters explicitly with a fixed number of indexes in «col». For example, you might list two indexes to be transformed (so that the index of «col» has length three, the last column to hold the value). If we later supply a three dimensional array, two of the dimensions are transformed to a relational table, the extra third dimension is array-abstracted, resulting in an array of relational tables. This transformation takes a while to get your head around -- but it is an exact application of the law of array-abstraction above.

It is common to use MdArrayToTable() to transform all dimensions of an array into the relational table, often by omitting the «row» and «col» parameters. This method violates the law of array-abstraction -- and may cause problems when used for parametric or sensitivity analysis. But there are certainly legitimate uses of a full transformation -- for example when you interface with databases, spreadsheets, or programs external to Analytica that can handle only arrays with only one or two dimensions.

History

The optional parameters «Row» and «Col» were required before Analytica 4.5 and were previously named «I» and «L».

See Also

Comments


You are not allowed to post comments.