MdTable
Release: |
4.6 • 5.0 • 5.1 • 5.2 • 5.3 • 5.4 • 6.0 • 6.1 • 6.2 • 6.3 • 6.4 • 6.5 |
---|
MdTable(t, rows, cols,vars, conglomerationFn, defaultValue, valueColumn)
MdTable converts a 2D relational table, «t», indexed by «rows» and «cols», into an N-dimensional array result. It is analogous to PivotTable in Excel. It is the inverse of MdArrayToTable().
Suppose «cols» has N elements. In the standard case, the first N-1 elements contain the identifier of an index (or a Handle to an index), and the last column contains a numerical value (a "measure" in database terminology). It returns an array with N-1 dimensions, corresponding to the N-1 indexes. Each cell contains the sum (or other «conglomerationFn») over the numerical values for all rows of «t» whose values match the index values in first N-1 «cols». If the index values of a cell in the result array match no row in «t», the result cell contains «defaultValue» (default NULL).
![]() |
⇒ |
| ||||||||||||||||||||
A relational table on the left is the input to MdTable. The array on the right is the result. |
If the values in the first N-1 «cols» are identifiers of or handles to the desired indexes, you can specify the list of identifers or handles to indexes in the optional parameters «vars». You may need to create indexes for some or all of them. You can use the Unique function to define each index as the unique set of values from the corresponding column.Or you can save that effort by using Smart_MdTable from the Smart_MdTable library, which automatically converting each of the first N-1 «cols» into a local index with the unique values from the corresponding column.
MdTable can also handle a relational table that has multiple values ("measure" columns) -- a "fact table" in database terminology. In this case, last M elements of the «cols» index contain these measures. And the first N-M columns contain the coordinates of the indexes. You specify «valueColumn» parameter as the index over last M value columns. Then the resulting array has N-M+1 indexes -- corresponding to the first N-M columns plus the «valueColumn» index.
Optional parameters
vars
If «Cols» already contains identifiers of (or handles to) the Index variables containing the unique values in each column, you can omit «vars». Otherwise, you should specify «vars» as a list of the identifiers or Handles to these indexes. You don't need to include the last column(s) in «vars», which it assumes contains the values. Each index in «vars» becomes an index of the result. If you want to refer to an existing local index, you must use a Handle, because text identifiers can only refer to a global variable. It's safer to use Handles for all the Indexes, so that the model won't break if someone changes the identifier of one of the Indexes. .
You should make sure that all these Indexes have as values all the unique values from the corresponding column in «T», for example:
Index Cols := ['A', 'B', 'Value']
Index Rows := 1..100
Variable T := Table(Cols, Rows)(.....)
Index A := Unique(T[Cols = 'A'], Rows)
Index B := Unique(T[Cols = 'B'], Rows)
Variable Result := MdTable(T, Rows, Cols)
It's safer to use «vars» using handles rather than the text identifiers of each index:
INDEX Indexes := ListOfHandles(A, B)
Variable Result := MDArray(T, Rows, Cols, Indexes)
In this case, the model will not break if someone changes the Identifier of A
or B
, since these automatically propagate through Indexes
.
conglomerationFn
The «conglomerationFn» function combines the values when two or more rows of «T» have the same coordinates (values in their index columns). The default is "sum". You can specify other options as a text identifier or handle, including: "min"
, "max"
, "average"
, "count"
, "product"
, "First"
, "Last"
or any Array-reducing functions that operates over an index, with parameters of the form: (A: Array[I]; I: Index)
. It is OK if it has other parameters as long as they are optional.
You can also create your own custom conglomerationFn function as a UDF.
The "count"
method ignores the contents of the value column(s), so is equivalent to "Sum"
when the value column is one everywhere. Unless you explicitly specify the default value, "count"
uses 0, whereas other methods default to Null.
defaultVal
(Default: NULL) The value of a result cell that has no corresponding rows in «T». It is often a good idea to set the default to 0.
valueColumn
In OLAP terminology, a fact table is a table in which the first N-M columns specify coordinates in a multi-dimensional cube, and the last M columns specify measures along a measure dimension. Each row has multiple values across this measure dimension. In this case, MdTable() generates a multi-dimensional array with N-M+1 dimensions, the extra dimension being the measure dimension. You specify the measure dimension in the «valueColumn» parameter, usually as an Index, but it can be a 1-D array, indexed by your measure index. The size of the «valueColumn» index must be M. In this case, «vars» identifies as indexes only the first N-M elements of «Cols». It maps the remaining M «Cols» into «valueColumn», which is also an index of the result.
Library
Array / Dimensional transformations
Examples
Suppose T
, Rows
, and Cols
are defined like this:
Cols ▶ Rows ▼ Car_type Mpg X 1 VW 26 2185 2 VW 30 1705 3 Honda 26 2330 4 Honda 35 2210 5 BMW 30 2955 6 BMW 35 2800 7 BMW 35 2870
The first example requires Analytica 6.3 or greater. It automatically creates local indexes for all of the result indexes for you.
MdTable(T, Rows, Cols, '.' & Cols, 'average')→
.Mpg ▶ .Car_type ▼ 26 30 35 BMW Null 2955 2835 Honda 2330 Null 2210 VW 2185 1705 Null
This next example assumes you've created indexes Car_type
and Mpg
in advance. Since you created them yourself, you have control over the ordering of the index elements.
MdTable(T, Rows, Cols, [Car_type, Mpg], 'average', 'n/a') →
Mpg ▶ Car_type ▼ 26 30 35 VW 2185 1705 n/a Honda 2330 n/a 2210 BMW n/a 2955 2835
Cells with no corresponding rows in T
containing n/a
. Rows 6 and 7 in T
both specify values for Car_type = BMW, Mpg = 35
, which are combined by the "average" «conglomerationFn» function.
You can also create global indexes for the result, but let it set the index values for you (requires Analytica 6.3 or later).
Index Mpg := ComputedBy(T)
Index Car_type := ComputedBy(T)
MdTable(T, Rows, Cols, [Car_type, Mpg], 'average') →
.Mpg ▶ Car_type ▼ 26 30 35 BMW Null 2955 2835 Honda 2330 Null 2210 VW 2185 1705 Null
MdTable can also work with a User-Defined function for «conglomerationFn», provided it is an Array-reducing functions that operates over an index. Suppose we define this Function First
that returns the first element of an array over an index:
Function First( A : Array[I] ; I : Index ) := A[@I = 1]
We can then use it as the conglomerationFn function:
MdTable(T, Rows, Cols, [Car_type, Mpg], 'First', 'n/a') →
Mpg ▶ Car_type ▼ 26 30 35 VW 2185 1705 n/a Honda 2330 n/a 2210 BMW n/a 2955 2800
To aggregate X
over Car_type
, we can use the «valueColumn». Here we will aggregate using Sum.
MdTable(T, Rows, Cols, [Car_type], valueColumn: 'X') →
Car_Type ▼ VW 3890 Honda 4540 BMW 8625
In the previous example, Car_type
is the first column. When you aggregate in this fashion, the target aggregation index must be the first column. If you wanted to aggregate onto Mpg
(summing all records with the same Mpg
) then you would need to re-index first to make Mpg
the first column like this:
Index L := ['Mpg', 'X'];
MdTable(T[Cols = L], Rows, L, [mpg], valueColumn: 'X') →
Mpg ▼ 26 4515 30 4660 35 7880
If T
had 6 coordinate columns and you wanted to aggregate onto 3 dimensions only, then you'd need to make sure that the three final dimensions were in the first three columns. If they were not there initially, then you'd reindex as in the previous example. If you are aggregating only a single target dimension, the Aggregate function can also be used and may be more intuitive. MdTable is actually more general since you can aggregate onto a multi-dimensional table.
To use both Mpg
and X
as value columns, we can define a measure dimension:
Index Measure_Index := ['Mpg', 'X']
Then
MdTable(T, Rows, Cols, [Car_type], valueColumn: Measure_Index) →
MeasureIndex ▶ Car_Type ▼ Mpg X VW 56 3890 Honda 61 4540 BMW 100 8625
Notice here that both Mpg
and X
have been summed -- both values used the same conglomeration function. However, suppose we want the average value for Mpg
, but the maximum value for X
, i.e., each "measure" having its own conglomeration function. We can accomplish this using:
MdTable(T, Rows, Cols, [Car_type], valueColumn: Measure_Index,
conglomerationFn: Array(Measure_Index, ["average", "max"])) →
MeasureIndex ▶ Car_Type ▼ Mpg X VW 28 2185 Honda 30.5 2330 BMW 33.3 2955
Fact Table
In order to convert a 2D relational table with more than one value per combination of indexes, you would use the parameter «valueColumn» to create a Fact Table. For example, suppose T
, Rows
, and Cols
are defined as indicated by the following table:
Cols ▶ Rows ▼ Car_type Mpg X Y 1 VW 26 2185 1 2 VW 30 1705 2 3 Honda 26 2330 3 4 Honda 35 2210 3 5 BMW 30 2955 4 6 BMW 35 2800 5 7 BMW 35 2870 5
And suppose Fact
is an index defined as ['X', 'Y']
. Therefore:
MDTable(T, Rows, Cols, [Car_type, Mpg], valueColumn: Fact, defaultValue: "n/a") →
MPG = 26 MPG = 30 MPG = 35 Fact ▶ Car_type ▼ X Y VW 2185 1 Honda 2330 3 BMW n/a n/a Fact ▶ Car_type ▼ X Y VW 1705 2 Honda n/a n/a BMW 1955 4 Fact ▶ Car_type ▼ X Y VW n/a n/a Honda 2210 3 BMW 5670 10 Notice that in the example, Rows 6 and 7 both specified values for
Car_type = BMW, Mpg=35
. By default the Sum conglomerationFn function was used to combine these.History
Introduced in Analytica 4.0.
Some conveniences added in Analytica 6.3:
- Automatic setting of global result indexes when they are defined with ComputedBy.
- Auto-creation of local result indexes then given a textual index name starting with a dot (
'.'
) - Ignore columns when the corresponding entry in «vars» is Null.
- Ability to use a 'Count' agglomeration without any value column. Done by setting «valueColumn» to Null, or by listing all indexes in a list to «vars».
- «vars» can be indexed by «Cols». In this case, «valueColumn» can be any column or 1-D array of columns (i.e., the index columns don't have to come first).
- The preferred name for the 5th parameter was changed from «conglomerationFn» to «type».
See Also
- MdArrayToTable inverse of MdTable
- Smart_MdTable library
- Relational tables and multiD arrays
- Unflatten
- Unique
- Aggregate
Enable comment auto-refresher