Unique
Unique(a, I, position, caseInsensitive, resultIndex, mapToUnique)
(In Analytica 5.0 or later) Without the index parameter «I», Unique returns a list containing all the unique atoms in «a», with duplicates removed. «a» can be a multidimensional array, in which case the result contains the unique atoms (cells) across all dimensions.
When an index «I» is specified, Unique returns a maximal subset of index «I» such that each indicated slice of array «a» along «I» is unique. Use this form to remove duplicate slices from an array, or to identify a single member of each equivalence class.
Note: Prior to Analytica 5.0, the «I» parameter was required.
Example
Let:
Variable DataSet :=
PersonNum ▼ / Field ▶    LastName    FirstName    Company
1                        Smith       Bob          Acme
2                        Jones       John         Acme
3                        Johnson     Bob          Floorworks
4                        Smith       Bob          Acme
Then:
Unique(DataSet) → ['Smith', 'Jones', 'Johnson', 'Bob', 'John', 'Acme', 'Floorworks'] {Requires Analytica 5.0}
Unique(DataSet, PersonNum) → [1, 2, 3]
Unique(DataSet[Field = 'Company'], PersonNum) → [1, 3]
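For readers who find it easier to reason in a general-purpose language, the difference between the two usages can be sketched in Python. This is only a rough analogue, not Analytica code: the dictionary layout and the helper names unique_atoms and unique_slices are assumptions made for illustration, and the field-by-field traversal order is chosen simply so that the output matches the result listed above.

```python
# Rough Python analogue of the two usages; not Analytica code.
# DataSet is modelled as {PersonNum: {Field: value}} (an assumed layout).
data_set = {
    1: {"LastName": "Smith", "FirstName": "Bob", "Company": "Acme"},
    2: {"LastName": "Jones", "FirstName": "John", "Company": "Acme"},
    3: {"LastName": "Johnson", "FirstName": "Bob", "Company": "Floorworks"},
    4: {"LastName": "Smith", "FirstName": "Bob", "Company": "Acme"},
}
fields = ["LastName", "FirstName", "Company"]


def unique_atoms(data, field_names):
    """Analogue of Unique(DataSet): distinct cell values across all dimensions."""
    seen = []
    for field in field_names:  # traversal order is an assumption
        for row in data.values():
            if row[field] not in seen:
                seen.append(row[field])
    return seen


def unique_slices(data, field_names):
    """Analogue of Unique(DataSet, PersonNum): first key of each distinct row."""
    seen = set()
    kept = []
    for key, row in data.items():
        signature = tuple(row[f] for f in field_names)
        if signature not in seen:
            seen.add(signature)
            kept.append(key)
    return kept


atoms = unique_atoms(data_set, fields)
people = unique_slices(data_set, fields)
by_company = unique_slices(data_set, ["Company"])
print(atoms)
print(people)      # [1, 2, 3]
print(by_company)  # [1, 3]
```

Person 4 is dropped in the second call because the whole row duplicates person 1, while the third call compares only the Company field, so persons 2 and 4 both collapse into person 1.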
Optional parameters
I
The index parameter «I» is optional as of Analytica 5.0, but is most often specified. When specified, Unique finds the unique slices along «I». When omitted, it finds the unique atomic values among all cells of «a».
Position
By default, Unique returns the elements of the index. Setting the optional parameter «position» to true (position: true) returns the positions of the elements in «I» rather than the elements themselves (see Associative vs. Positional Indexing).
This parameter is not used when the «I» index parameter is omitted.
CaseInsensitive
When Unique is applied to text values, comparison is case-sensitive by default; for example, "Apple" and "apple" are considered distinct elements.
Specifying caseInsensitive: true ignores differences in upper and lower case in text values when determining whether values are unique.
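The effect of the flag can be illustrated with a small Python sketch. It is an analogy rather than Analytica code: the helper name unique_text is hypothetical, and keeping the first spelling encountered as the representative of each group is an assumption, not documented behavior.

```python
def unique_text(values, case_insensitive=False):
    """Keep the first occurrence of each value, optionally ignoring case."""
    seen = set()
    result = []
    for value in values:
        # Compare on a lower-cased key only when asked to, and only for text.
        if case_insensitive and isinstance(value, str):
            key = value.lower()
        else:
            key = value
        if key not in seen:
            seen.add(key)
            result.append(value)  # assumption: the first spelling is kept
    return result


fruits = ["Apple", "apple", "Pear", "APPLE"]
sensitive = unique_text(fruits)
insensitive = unique_text(fruits, case_insensitive=True)
print(sensitive)    # ['Apple', 'apple', 'Pear', 'APPLE']
print(insensitive)  # ['Apple', 'Pear']
```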
ResultIndex
When an index is provided for this parameter, the result is indexed by the given index instead of being an unindexed list. If the index supplied is shorter than the number of unique items, then only the first n items are returned, where n is the size of the index. When the result index is too long, the result is null-padded.
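The truncation and padding rule can be sketched in Python as follows. This is an illustrative analogue only: fit_to_index is a hypothetical helper, and Python's None stands in for Analytica's Null.

```python
def fit_to_index(unique_items, result_index):
    """Lay unique items out along result_index, truncating or null-padding."""
    n = len(result_index)
    fitted = list(unique_items[:n])        # index too short: keep the first n
    fitted += [None] * (n - len(fitted))   # index too long: pad with nulls
    return dict(zip(result_index, fitted))


short = fit_to_index(["x", "y", "z"], [1, 2])
long = fit_to_index(["x", "y", "z"], [1, 2, 3, 4, 5])
print(short)  # {1: 'x', 2: 'y'}
print(long)   # {1: 'x', 2: 'y', 3: 'z', 4: None, 5: None}
```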
«ResultIndex» can be useful when you want to array abstract. For example, in a 2-D array a, you may want to identify the unique items along I separately for each item in index J, as follows.
For jj := J Do Unique(a[J = jj], I, resultIndex: I)
Without the «resultIndex» parameter, each iteration would return a list, and the For loop would then need to combine multiple lists (i.e., implicit dimensions), which is disallowed and results in an error. By ensuring that each result has an explicit index -- I in this example -- the results can be successfully combined.
It is also worth noting that the For loop example just given is not equivalent to
Unique(a, I, resultIndex: I)
The reason is that Unique(a, I) compares entire slices -- it isn't operating over each slice of the exogenous dimensions separately as most other array functions do.
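The difference between the two forms is easiest to see on concrete data. The following Python sketch is an analogue under stated assumptions, not Analytica code: the sample values, the nested-dictionary layout, and the helpers unique_along and pad are all invented for illustration, and None stands in for Null.

```python
rows_index = [1, 2, 3]     # plays the role of index I
cols_index = ["p", "q"]    # plays the role of index J
a = {
    1: {"p": 5, "q": 7},
    2: {"p": 5, "q": 8},
    3: {"p": 6, "q": 7},
}


def unique_along(array, keys, columns):
    """Keys whose slice, restricted to the given columns, is first of its kind."""
    seen = set()
    kept = []
    for key in keys:
        signature = tuple(array[key][c] for c in columns)
        if signature not in seen:
            seen.add(signature)
            kept.append(key)
    return kept


def pad(items, n):
    """Null-pad or truncate to length n, as a result index would."""
    return (items + [None] * n)[:n]


# Analogue of the For loop: one column of J at a time.
per_column = {
    jj: pad(unique_along(a, rows_index, [jj]), len(rows_index))
    for jj in cols_index
}
# Analogue of Unique(a, I, resultIndex: I): whole rows are compared.
whole = pad(unique_along(a, rows_index, cols_index), len(rows_index))

print(per_column)  # {'p': [1, 3, None], 'q': [1, 2, None]}
print(whole)       # [1, 2, 3]
```

Every full row here is distinct, so the whole-slice comparison keeps all three elements, whereas each column taken on its own contains a repeated value.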
mapToUnique
(new to Analytica 5.0)
Unique(A, I, mapToUnique: true) returns an array indexed by «I» that maps each element of «I» back to the first element of «I» at which A has the same value. The first element with a given value maps to itself, and is the element that would have been returned if «mapToUnique» were not specified. For example:
Index I := 'a'..'e'
Variable A := Table(I)(4, 2, 4, 3, 2)
Unique(A, I, mapToUnique: true) → Array(I, ['a', 'b', 'a', 'd', 'b'])
Unique(A, I, position: true, mapToUnique: true) → Array(I, [1, 2, 1, 4, 2])
One situation where this is useful is when you use Unique(A, I) to find the unique slices so that you can compute an expensive function only on those slices. You then need to work out which result to use for each of the other slices, which is exactly what «mapToUnique» gives you.
Since «mapToUnique» returns elements of «I» (or positions along «I»), you must specify the index «I» when using «mapToUnique».
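The compute-once-then-expand pattern described above can be sketched in Python. This is an analogue, not Analytica code: the sample data, the helper map_to_unique, and the stand-in function expensive are hypothetical, and the mapping follows the rule stated above that the first element with a given value maps to itself.

```python
def map_to_unique(index, values):
    """Map each index label to the first label that carries the same value."""
    first_seen = {}
    mapping = {}
    for label, value in zip(index, values):
        first_seen.setdefault(value, label)  # remember the first label only
        mapping[label] = first_seen[value]
    return mapping


def expensive(x):
    """Stand-in for a costly computation; counts how often it runs."""
    expensive.calls += 1
    return x * x


expensive.calls = 0

index = ["v", "w", "x", "y", "z"]
values = [10, 20, 10, 30, 20]
by_label = dict(zip(index, values))

mapping = map_to_unique(index, values)
unique_labels = [lab for lab in index if mapping[lab] == lab]

# Evaluate only on the unique labels, then expand back over the full index.
cache = {lab: expensive(by_label[lab]) for lab in unique_labels}
results = {lab: cache[mapping[lab]] for lab in index}

print(mapping)          # {'v': 'v', 'w': 'w', 'x': 'v', 'y': 'y', 'z': 'w'}
print(results)          # {'v': 100, 'w': 400, 'x': 100, 'y': 900, 'z': 400}
print(expensive.calls)  # 3
```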
Notes
The Set Functions such as SetDifference or SetUnion ensure that no duplicates exist in the final result, and hence can also be used to find the unique elements. For example, when L is a 1-D array or list,
#SetDifference(\L)
and
Unique(L)
do the same thing. Since no set is being subtracted, SetDifference returns the set \L after duplicates are removed. In some instances, SetDifference has advantages over Unique. For example, if you also want to ignore certain values, such as Null, you could compute the unique elements and then call SetDifference to remove the other values. In that case you might as well skip the call to Unique entirely, since SetDifference already removes duplicates.
The ordering of elements in the result follows the ordering of the elements in «a», which often feels arbitrary. Hence, it is common to wrap the call to Unique in a call to SortIndex, such as
SortIndex(Unique(a))
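Both notes above, dropping unwanted values while deduplicating and then sorting the result, can be pictured with a short Python sketch. It is an analogy only: unique_excluding is a hypothetical helper, None stands in for Null, and Python's sorted plays the role that SortIndex plays in the expression above.

```python
def unique_excluding(values, ignore=()):
    """First-occurrence dedupe that also drops any value listed in ignore."""
    result = []
    for value in values:
        if value in ignore or value in result:
            continue
        result.append(value)
    return result


data = [3, None, 1, 3, 2, None, 1]
plain = unique_excluding(data)
without_null = unique_excluding(data, ignore=(None,))
ordered = sorted(without_null)
print(plain)         # [3, None, 1, 2]
print(without_null)  # [3, 1, 2]
print(ordered)       # [1, 2, 3]
```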
History
The «I» parameter was first made optional in Analytica 5.0. Prior to that, the Unique(a) usage was not available. In releases where the index cannot be omitted, use a[I = Unique(a, I)] or #SetDifference(\a) to define an index.
The «mapToUnique» parameter was introduced in Analytica 5.0.
The «resultIndex» parameter is present in Analytica 4.3 and later, but hidden (doesn't show up in Expression Assist, etc.) until Analytica 5.0. You can still use it in those earlier releases even though it is hidden.
See also
- SortIndex
- Subset
- Set Functions -- These also remove duplicates from their results.