Difference between revisions of "MdTable"

(Example with separate aggregation functions)
m (removed TBD)
Line 9: Line 9:
 
Note: the optional valueColumn parameter, and thus the ability to have ''M>1'', is new to 4.0.
 
Note: the optional valueColumn parameter, and thus the ability to have ''M>1'', is new to 4.0.
  
The dimensions of the final result are given by the optional parameter ''Vars''. ''Vars'' must be a list of index identifiers or index names.  (Either identifiers, [[VarTerm]]s, or strings).  If an element contains an index name as a string, the index must be a global index (i.e,. not a [[Local Indexes|local index]].
+
The dimensions of the final result are given by the optional parameter ''Vars''. ''Vars'' must be a list of index identifiers or index names.  (Either identifiers, [[VarTerm]]s, or strings).  If an element contains an index name as a string, the index must be a global index (i.e,. not a [[Local Indexes|local index]]).
  
 
As an alternative to using the ''Vars'' parameter, you can specify the index names or identifiers for the final array as the first ''Size(Col)-1'' elements of the ''Col' index.  In this case, the ''Vars'' parameter does not need to be specified.
 
As an alternative to using the ''Vars'' parameter, you can specify the index names or identifiers for the final array as the first ''Size(Col)-1'' elements of the ''Col' index.  In this case, the ''Vars'' parameter does not need to be specified.
  
In OLAP terminology, a fact table is a table in which the first N columns specify coordinates in a multi-dimensional cube, and the last M columns specify ''measures'' along a ''measure dimension'', so that the final multi-dimensional array has ''N+1'' dimensions, the extra dimension being the measure dimension.  This transformation is accomplished by providing the measure dimension in the valueColumn parameter.  The valueColumn parameter should then be a 1-D array, indexed by your measure dimension, where the array elements identify columns in the Cols index.  (TBD: Insert example model)
+
In OLAP terminology, a fact table is a table in which the first N columns specify coordinates in a multi-dimensional cube, and the last M columns specify ''measures'' along a ''measure dimension'', so that the final multi-dimensional array has ''N+1'' dimensions, the extra dimension being the measure dimension.  This transformation is accomplished by providing the measure dimension in the valueColumn parameter.  The valueColumn parameter should then be a 1-D array, indexed by your measure dimension, where the array elements identify columns in the Cols index.   
  
 
Before using ''MDTable'', you must define all of the indexes for the result. Each index must include all values that occur in the corresponding column of ''T'' or an error will result. The [[Unique]]() function is useful for defining the necessary indexes.
 
Before using ''MDTable'', you must define all of the indexes for the result. Each index must include all values that occur in the corresponding column of ''T'' or an error will result. The [[Unique]]() function is useful for defining the necessary indexes.

Revision as of 20:14, 8 March 2007


MdTable(T,Rows,Cols,Vars,conglomerationFn,defaultValue,valueColumn)

Returns a multi-dimensional array from a two-dimensional table of values. T is a two-dimensional array (i.e., a table) indexed by Rows and Cols. Each row of T specifies a data point containing the coordinates of a cell in a multi-dimensional array, along with the value for that cell.

MdTable interprets the first N columns (along the Cols index) as containing a coordinate in the final multi-dimensional array, and last remaining M columns as containing values (in OLAP terminology, measures), where N+M = Size(Cols). In the most common usage, M=1 and N=Size(Cols-1), meaning that all columns except the last specify the coordinate, and the last column contains the cell value. In this case, the Vars and valueColumn parameters are optional. If the Vars parameter is specified, its length determines how many columns are used for coordinates, and if its length is less than Size(Cols)-1, then the valueColumn parameter must be specified to indicate which column contains the cell value.

Note: the optional valueColumn parameter, and thus the ability to have M>1, is new to 4.0.

The dimensions of the final result are given by the optional parameter Vars. Vars must be a list of index identifiers or index names. (Either identifiers, VarTerms, or strings). If an element contains an index name as a string, the index must be a global index (i.e,. not a local index).

As an alternative to using the Vars parameter, you can specify the index names or identifiers for the final array as the first Size(Col)-1 elements of the Col' index. In this case, the Vars parameter does not need to be specified.

In OLAP terminology, a fact table is a table in which the first N columns specify coordinates in a multi-dimensional cube, and the last M columns specify measures along a measure dimension, so that the final multi-dimensional array has N+1 dimensions, the extra dimension being the measure dimension. This transformation is accomplished by providing the measure dimension in the valueColumn parameter. The valueColumn parameter should then be a 1-D array, indexed by your measure dimension, where the array elements identify columns in the Cols index.

Before using MDTable, you must define all of the indexes for the result. Each index must include all values that occur in the corresponding column of T or an error will result. The Unique() function is useful for defining the necessary indexes.

It is possible that two or more rows of T specify identical coordinates. In this case, a conglomeration function is used combine the values for the given cell. The conglomFn parameter is a text value specifying which conglomeration function is to be used. Some possible values are: "sum" (default), "min", "max", "average", or "product". You can also create your own custom conglomeration function.

A congomeration function has a parameter declaration of: ( A : Array[I] ; I : IndexType ). Additional parameters are okay as long as they are optional. You can define a user-defined function of this form to perform the conglomeration for a cell when multiple coordinates map to the same cell. You then specify the function name (as a string) to the congomerationFn parameter.

It is also possible that no row in T corresponds to a particular cell. In this case, the cell value is set to missingval, or if the defaultVal parameter is omitted, the cell value is set to Null.

Library

Array

Examples

Download model with examples (requires Analytica 4.0)

Suppose T, Rows, and Cols are defined as indicated by the following table:

Cols →
Rows ↓ Car_type Mpg X
1 VW 26 2185
2 VW 30 1705
3 Honda 26 2330
4 Honda 35 2210
5 BMW 30 2955
6 BMW 35 2800
7 BMW 35 2870
MDTable(T,Rows,Cols,[Car_type,Mpg],'average','n/a') -->
Mpg →
Car_type ↓ 26 30 35
VW 2185 1705 n/a
Honda 2330 n/a 2210
BMW n/a 2955 2835

Notice that in the example, Rows 6 and 7 both specified values for Car_type=BMW, Mpg=35. The "average" conglomeration function was used to combine these.

Now suppose we have a User-Defined Function, First:

 Function First( A : Array[I] ; I : Index ) := A[@I=1]

Then

MdTable(T,Rows,Cols,[Car_type,Mpg],'First','n/a') -->
Mpg →
Car_type ↓ 26 30 35
VW 2185 1705 n/a
Honda 2330 n/a 2210
BMW n/a 2955 2800

To aggregate X over Car_type, we can use the valueColumn. Here we will aggregate using Sum.

MdTable(T,Rows,Cols,[Car_type],valueColumn:'X') →
Car_Type ↓
VW 3890
Honda 4540
BMW 8625

To use both Mpg and X as value columns, we can define a measure dimension:

Index Measure_Index := ['Mpg','X']

Then

MdTable(T,Rows,Cols,[Car_type],valueColumn:Measure_Index) →
MeasureIndex →
Car_Type ↓ Mpg X
VW 56 3890
Honda 61 4540
BMW 100 8625

Notice here that both Mpg and X have been summed -- both values used the same conglomeration function. However, suppose we want the average value for Mpg, but the maximum value for X, i.e., each "measure" having its own conglomeration. We can accomplish this using:

MdTable(T,Rows,Cols,[Car_type],valueColumn:Measure_Index,
        conglomerationFn: Array(Measure_Index,["average","max"]) ) →
MeasureIndex →
Car_Type ↓ Mpg X
VW 28 2185
Honda 30.5 2330
BMW 33.3 2955

See Also

Comments


You are not allowed to post comments.