Relational tables and multiD arrays

Revision as of 23:28, 12 December 2015 by Bbecane (talk | contribs)

The MDArrayToTable() function “flattens” a multi-dimensional array into a two-dimensional relational table. When a simple relational transformation is desired, the table will have one row for each array element and only one column (the right-most column) containing those element values. Alternatively MDArrayToTable() can produce a fact table in which values occupy multiple columns divided over a specified index, referred to as the Value Index. Both of these methods are described in separate sections below. The third section applies to both kinds of tables and describes partial transformations in which the MDArrayToTable() function will only operate on a subset of indexes, leaving the rest in array form.

The MDTable() function does the inverse, creating a multi-dimensional array from a table of values. Viewing tabular results in a multi-dimensional form via MDTable() often provides informative new perspective on existing data.

Many external application programs, including spreadsheets and relational databases, are limited to two-dimensional tables. Thus, before transferring multi-dimensional data between these applications and Analytica, it might be necessary to convert between multi-dimensional arrays and two-dimensional tables.

MDArrayToTable(a, row, col) (pure relational transformation)

Transforms a multi-dimensional array, a, into a two-dimensional array (i.e., a table) indexed by row and col. The result contains one row along row for each element of a. Each column along col, except the far right column, represents a coordinate index from the array. Index columns are populated with the coordinate values for the element. The far right column contains the actual value of the element. Both row and col are optional, if either or both is omitted, a local index of the appropriate length is automatically created. The local index col contains handles to the array’s indexes for the coordinate columns.

If you provide the index row with the number of elements equal to size(a), the resulting table will contain all array elements. If the number of elements in row is equal to the number of nonempty elements of a, the resulting table will contain only the non-empty elements of the array. If the number of elements in row is anything else, an error will occur. The optional parameters omitNull and omitZero control what is considered to be empty. When you omit row, the local index is sized for the non-empty elements, or you can obtain all elements by setting both omitNull and omitZero to false.

Library: Array

Example: Starting with “Array_3x3x3” indexed by “Number”, “Letter”, and “Hue”

Number ▶
Letter ▼ 1 2 3
A 45 19 92
B 13 21 81
C 12 43 47

Hue = 'Red'


Number ▶
Letter ▼ 1 2 3
A 34 25 45
B 11 62 19
C 84 45 53

Hue = 'Green'


Number ▶
Letter ▼ 1 2 3
A 21 65 95
B 48 33 12
C 57 56 23

Hue = 'Blue'


Row := sequence(1, size(Array_3x3x3))
Col := ['Number', 'Letter', 'Hue', 'Value']
All but the last column

elements specify indexes from the array

The last column element contains the

name heading of the value column. It does not specify an index

MDArrayToTable(Array_3x3x3, Row, Col) →

Col ▶
Row ▼ Number Letter Hue Value
1 1 A Red 45
2 1 A Green 34
3 1 A Blue 21
4 1 B Red 13
...
26 3 C Green 53
27 3 C Blue 23

The resulting table populates Index columns with the corresponding index labels for the array element.

Set the optional parameter positional to true to return the index positions rather than index labels:

MDArrayToTable(Array_3x3x3, Row, Col, positional: true) →

Col ▶
Row ▼ Number Letter Hue Value
1 1 1 1 45
2 1 1 2 34
3 1 1 3 21
4 1 2 1 13
...
26 3 3 2 53
27 3 3 3 23

Column headings can be customized using a one-dimensional variable to associate headings and indexes:

Index Headings := ['X', 'Y', 'Z', 'Values']

Variable Index2Heading := Table(Headings) ('Number', 'Letter', 'Hue', 'Anything')

MDArrayToTable(Array_3x3x3, Row, Index2Heading, positional: true) →

Col ▶
Row ▼ X Y Z Values
1 1 1 1 45
2 1 1 2 34
3 1 1 3 21
4 1 2 1 13
...
26 3 3 2 53
27 3 3 3 23

MDArrayToTable(a, row, col, valueIndex) (fact table transformation)

The MDArrayToTable() function creates a fact table whenever the optional valueIndex parameter is used. This format allows you to have multiple columns of values divided along one of the original array indexes, the designated valueIndex. The headings of each value column are typically the elements of the value index. In this type of transformation, the value index is not considered to be a positional coordinate of the original array. Instead, the array is interpreted as having a reduced coordinate system spanned only by the remaining indexes, with each coordinate location containing multiple values.

Example: Starting with the same array from the example above, you can specify Hue as the value index. The array would be considered to have only two remaining coordinate dimensions: Number and Letter. This changes the effective size of the array, and therefore the length of the row index:

Rows := 1..Sum(1, Number, Letter) or equivalently: Row := 1..9

The Col parameter includes index columns as before, but the value column now contains the elements of the value index:

Cols := Concat(['Number', 'Letter'], Hue) or equivalently:
Cols := ['Number', 'Letter', 'Red', 'Green', 'Blue']
First column elements specify

remaining coordinate indexes in the array.

The right-most n column elements

are elements of the value index, where n is the size of the value index. (In this example Hue has three elements)

If you omit the Col parameter, MDArrayToTable will create a local index for you named .Col, using the elements of Hue as the last there elements of the local index.

Include Hue as the ValueIndex parameter:

(Array_3x3x3, Rows, Cols, Hue) →

Cols ▶
Rows ▼ Number Letter Red Green Blue
1 1 A 45 34 21
2 1 B 13 11 48
3 1 C 12 84 57
4 2 A 19 25 65
...
8 3 B 81 19 12
9 3 C 47 53 23

MDArrayToTable() (partial transformation)

In both examples above, the MDArrayToTable() function completely “flattened” the entire array into a two-dimensional table. But every array can be regarded as a stack of smaller sub-arrays, each having one less dimension than the whole. If MDArrayToTable() operates on only one of these sub-arrays, the result will also be a flat two-dimensional table. If it operates on each of the sub-arrays individually, the result will be a stack of tables which can no longer be described as completely “flat”. Extending this idea, it is possible to imagine a plane of tables or a cube of tables if the original array starts off with enough dimensions. This essentially describes the idea of a partial transformation. The MDArrayToTable() function is not obliged to operate on all available array indexes. In fact there are often advantages to leaving some dimensions intact since they can still be analyzed using Analytica’s array abstraction features.

Specifying which dimensions are to be preserved in an MDArrayToTable() transformation involves a specialized coding technique described by some advanced users in New Jersey as, “fugetaboutit!” If you simply forget about certain indexes and remove all references to them in the MDArrayToTable() function and its parameters, the result will be an array of tables with dimensions along the omitted indexes. This is true whether or not the ValueIndex parameter is used. In any case the length of the Row parameter must be equal to the number of elements (or mutually non-empty elements) in each of the flattened sub-arrays.

MDTable(T, rows, cols, vars, conglomerationFn, defaultValue, ValueColumn)

Returns a multi-dimensional array from a two-dimensional table of values. If n is the total number of columns and m is the number of value columns in the table T, the following structure is assumed: The first n-m columns of T specify coordinates, and the right-most m columns contain data values. When there is only one data value in each row (a single value column) the table is described as a pure relational table. When there are multiple value columns it is described as a fact table.

rows and cols specify the vertical and horizontal indexes of the two-dimensional table. The length of the rows index is equal to the number of data records in the table. The length of cols is equal to the total number of columns.

The optional parameter ValueColumn specifies the index over which multiple value columns are divided in a fact table. If ValueColumn is omitted, the table is assumed in pure relational format, having a single value column.

The parameter vars is a list of index identifiers specifying the coordinate dimensions of the final array. It is optional if the table is in pure relational format (single value column). If vars is omitted, the dimensions of the final result are specified by the first n-1 elements of cols, where n is the number of elements in cols.

If a fact table (multiple value columns) is being transformed then you must use vars to specify the coordinate indexes of the array. The elements of vars must correspond to the coordinate columns of the table or an error will result. The number of elements in vars should be equal to n-m where n is the number of elements in cols and m is the number of value columns. Note that the list of coordinate indexes does not include the value index.

It is possible that two or more rows of T specify identical coordinates. In this case, a conglomeration function is used to combine the values for the given cell. The conglomerationFn parameter is a text value specifying which conglomeration function is to be used. Possible values are "sum" (default), "min", "max", "average", and ,"product".

It is also possible that no row in t corresponds to a particular cell. In this case, the cell value is set to defaultValue, or if the defaultValue parameter is omitted, the cell value is set to undefined. Undefined values can be detected using the IsUndef() function.

Library: Array

Example: Starting with the fact table produced in the previous section:

Cols ▶
Rows ▼ Number Letter Red Green Blue
1 1 A 45 34 21
2 1 B 13 11 48
3 1 C 12 84 57
4 2 A 19 25 65
...
8 3 B 81 19 12
9 3 C 47 53 23

Index rows := 1..9

Index cols := ['Number', 'Letter', 'Red', 'Green', 'Blue']

Index Hue := ['Red', 'Green', 'Blue']

MDTable(T, rows, cols, ['Number', 'Letter'],ValueColumn: Hue) →

Number ▶
Letter ▼ 1 2 3
A 45 19 92
B 13 21 81
C 12 43 47

Hue = 'Red'


Number ▶
Letter ▼ 1 2 3
A 34 25 45
B 11 62 19
C 84 45 53

Hue = 'Green'


Number ▶
Letter ▼ 1 2 3
A 21 65 95
B 48 33 12
C 57 56 23

Hue = 'Blue'


Example: Modifying the table by changing the (1,C) coordinate to (1,B), it now contains a missing value at (1,C) and redundant values at (1,B):

Cols ▶
Rows ▼ Number Letter Red Green Blue
1 1 A 45 34 21
2 1 B 13 11 48
3 1 B 12 84 57
4 2 A 19 25 65
...
8 3 B 81 19 12
9 3 C 47 53 23

MDTable(T,rows,cols,['Number','Letter'],'average','N/A',Hue) →

Number ▶
Letter ▼ 1 2 3
A 45 19 92
B 12.5 21 81
C N/A 43 47

Hue = 'Red'

Number ▶
Letter ▼ 1 2 3
A 34 25 45
B 47.5 62 19
C N/A 45 53

Hue = 'Green'

Number
Letter 1 2 3
A 34 25 45
B 47.5 62 19
C N/A 45 53

Hue = 'Blue'

Number ▶
Letter ▼ 1 2 3
A 45 19 92
B 12.5 21 81
C N/A 43 47

class="wikitable" Number ▶ Letter ▼ 1 2 3 A 34 25 45 B 47.5 62 19 C N/A 45 53 class="wikitable" Number ▶ Letter ▼ 1 2 3 A 21 65 95 B 52.5 33 12 C N/A 56 23 Hue = 'Red' Hue = 'Green' Hue = 'Blue'


Number ▶
Letter ▼ 1 2 3
A 45 19 92
B 12.5 21 81
C N/A 43 47

Hue = 'Red'


Number ▶
Letter ▼ 1 2 3
A 34 25 45
B 47.5 62 19
C N/A 45 53

Hue = 'Green'


Number ▶
Letter ▼ 1 2 3
A 21 65 95
B 52.5 33 12
C N/A 56 23

Hue = 'Blue'


MDTable() (partial transformation)

If the input is a partially flattened array of tables, no special action is necessary. The pre-existing dimensions will automatically be rolled up into the final array without any reference being made to them. This is consistent with the general principles of array abstraction in Analytica.

See Also

Comments


You are not allowed to post comments.