Unique
Unique(a, I, position, caseInsensitive, resultIndex, mapToUnique, condition)
Without the index parameter «I», it returns a list containing all the unique atoms in «a» (removing all duplicates). If «a» is a multidimensional array, the result contains the atoms that are unique across all of its dimensions.
When an index «I» is specified, it returns a subset of index «I» where each slice of array «a» along «I» is unique. You can use it to remove duplicate slices from an array, or to identify a single member of each equivalence class.
Example
Let:
Variable DataSet :=
PersonNum ▼ / Field ▶   LastName   FirstName   Company
1                       Smith      Bob         Acme
2                       Jones      John        Acme
3                       Johnson    Bob         Floorworks
4                       Smith      Bob         Acme
Then:
Unique(DataSet) → ['Smith', 'Jones', 'Johnson', 'Bob', 'John', 'Acme', 'Floorworks'] {Requires Analytica 5.0}
Unique(DataSet, PersonNum) → [1, 2, 3]
Unique(DataSet[Field = 'Company'], PersonNum) → [1, 3]
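For readers more used to general-purpose code, the three results above can be reproduced with a rough Python model. This is only a sketch of the described behavior, not Analytica code: the names FIELDS, DATASET, unique_atoms and unique_slices are hypothetical, and the field-by-field traversal order is an assumption chosen to match the order shown in the example.

```python
# Rough Python model of the DataSet example (not Analytica code).
FIELDS = ["LastName", "FirstName", "Company"]
DATASET = {
    1: ("Smith", "Bob", "Acme"),
    2: ("Jones", "John", "Acme"),
    3: ("Johnson", "Bob", "Floorworks"),
    4: ("Smith", "Bob", "Acme"),
}


def unique_atoms(table):
    """Distinct cell values, first occurrence first (models Unique(DataSet))."""
    seen = []
    # Assumption: walk field by field, then person by person,
    # which reproduces the ordering shown in the example.
    for col in range(len(FIELDS)):
        for row in table.values():
            if row[col] not in seen:
                seen.append(row[col])
    return seen


def unique_slices(table, fields=None):
    """Keys whose row, restricted to `fields`, has not appeared earlier."""
    cols = [FIELDS.index(f) for f in (fields or FIELDS)]
    seen, keep = set(), []
    for key, row in table.items():
        row_slice = tuple(row[c] for c in cols)
        if row_slice not in seen:
            seen.add(row_slice)
            keep.append(key)
    return keep


print(unique_atoms(DATASET))
print(unique_slices(DATASET))               # models Unique(DataSet, PersonNum)
print(unique_slices(DATASET, ["Company"]))  # models the Company-only case
```

Person 4 is dropped in the second call because the whole row repeats person 1, while in the third call only the Company value is compared, so persons 2 and 4 both collapse into person 1.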
Optional parameters
I
The index parameter «I» is optional. When omitted, Unique finds the unique atomic values among all cells of «a». When specified, it returns the elements of «I» whose slices of «a» are unique.
Position
By default, Unique returns the elements of the index. If the optional parameter «position» is true (position: true), it returns the positions of the elements in «I» rather than their values (see Associative vs. Positional Indexing).
This parameter is not used when the «I» index parameter is omitted.
CaseInsensitive
By default, Unique compares text values in a case-sensitive fashion; for example, "Apple" and "apple" are considered distinct elements.
Specifying caseInsensitive: true ignores differences between upper and lower case in text values when determining whether values are unique.
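The effect of the flag can be illustrated with a small Python sketch. It is a model of the idea only: the function name unique_text is hypothetical, keeping the first spelling encountered is an assumption, and the exact case-folding rules Analytica applies are not stated here.

```python
def unique_text(values, case_insensitive=False):
    """Order-preserving de-duplication with optional case folding."""
    seen, result = set(), []
    for v in values:
        # Only text values are folded; other values are compared as they are.
        key = v.casefold() if case_insensitive and isinstance(v, str) else v
        if key not in seen:
            seen.add(key)
            result.append(v)  # assumption: the first spelling seen is kept
    return result


fruits = ["Apple", "apple", "Pear", "APPLE"]
print(unique_text(fruits))                         # all three spellings of apple are distinct
print(unique_text(fruits, case_insensitive=True))  # one apple entry and one pear entry
```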
ResultIndex
If you provide an index Result for parameter «resultIndex», the resulting unique values are returned in an array indexed by Result. If Result is shorter than the number of unique items, only the first n unique values are kept, where n is the size of Result. If Result is longer than the number of unique items, the extra cells are filled with Null.
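The truncate-or-pad rule can be sketched in Python as follows. The helper fit_to_index is hypothetical, a dict stands in for an array indexed by Result, and None stands in for Null.

```python
def fit_to_index(unique_values, result_index):
    """Lay unique values out along a fixed-length index.

    Values beyond the index length are dropped; missing cells become None.
    """
    n = len(result_index)
    kept = list(unique_values[:n])
    padded = kept + [None] * (n - len(kept))
    return dict(zip(result_index, padded))


print(fit_to_index(["x", "y", "z"], [1, 2]))     # index too short: 'z' is dropped
print(fit_to_index(["x", "y"], [1, 2, 3, 4]))    # index too long: padded with None
```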
«resultIndex» is useful when you want the result to array-abstract. For example, in a 2-D array A, you may want to identify the unique items along I separately for each item in index J:
For jj := J Do Unique(a[J = jj], I, resultIndex: I)
Without the «resultIndex» parameter, each iteration would return a list, and the For loop would then need to combine lists with incompatible implicit indexes, which would give an error. By ensuring that each result has an explicit index (I in this example), the results can be successfully combined.
This For loop example is not equivalent to:
Unique(a, I, resultIndex: I)
The reason is that Unique(a, I) compares entire slices. It does not operate separately over each slice of the remaining dimensions, as most other array functions do.
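The difference between the two forms can be seen in a small Python model. The data and the names I, J, A, unique_along_i and pad are made up for illustration, and None stands in for Null.

```python
I = ["i1", "i2", "i3"]
J = ["j1", "j2"]
A = {
    "i1": {"j1": 1, "j2": 5},
    "i2": {"j1": 1, "j2": 6},
    "i3": {"j1": 2, "j2": 5},
}


def unique_along_i(slices):
    """Elements of I whose slice is seen for the first time."""
    seen, keep = [], []
    for i in I:
        if slices[i] not in seen:
            seen.append(slices[i])
            keep.append(i)
    return keep


def pad(elements):
    """Give every result the length of I, as resultIndex: I would."""
    return elements + [None] * (len(I) - len(elements))


# Models the For loop: uniqueness is judged separately within each column of J.
per_column = {j: pad(unique_along_i({i: A[i][j] for i in I})) for j in J}

# Models Unique(a, I, resultIndex: I): each whole row is compared at once.
whole_slice = pad(unique_along_i({i: tuple(A[i][j] for j in J) for i in I}))

print(per_column)
print(whole_slice)
```

In this data every row is distinct when taken as a whole, so the whole-slice form keeps all three elements of I, whereas each column on its own contains a repeated value and so yields only two elements plus padding.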
mapToUnique
(new to Analytica 5.0)
Unique(A, I, mapToUnique: true) returns an array indexed by «I» that maps each element of «I» back to the first element of «I» at which A has that same value. The first element with a given value maps to itself, and is the element that would have been returned if «mapToUnique» were not specified. For example:
- Index I := 'a'..'e'
- Variable A := Table(I)(4,2,4,3,2)
Unique(A, I, mapToUnique:true) → Array(I,['a','b','a','d','b'])
Unique(A, I, position:true, mapToUnique:true) → Array(I,[1,2,1,4,2])
One situation where this is useful is when you use Unique(A, I) to find the unique slices so that you can compute an expensive function only on those slices. You then need to determine which result to use for each of the other slices, and that is what «mapToUnique» gives you.
Since «mapToUnique» returns the position along «I», you must specify the index «I» when using «mapToUnique».
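The compute-once-then-reuse pattern can be sketched in Python. The sketch follows the prose description (each element maps to the first element with the same value); the function map_to_unique, the stand-in expensive, and the sample data are all hypothetical.

```python
def map_to_unique(values, index, position=False):
    """Map each index element to the first element holding the same value."""
    first_seen = {}
    mapping = []
    for pos, (label, value) in enumerate(zip(index, values), start=1):
        first_seen.setdefault(value, (label, pos))
        first_label, first_pos = first_seen[value]
        mapping.append(first_pos if position else first_label)
    return mapping


def expensive(x):
    """Hypothetical stand-in for a costly computation."""
    return x * x


index = ["p", "q", "r", "s", "t"]
values = [10, 20, 30, 10, 20]

mapping = map_to_unique(values, index)
positions = map_to_unique(values, index, position=True)

# Elements that map to themselves are the unique representatives.
representatives = [label for label, m in zip(index, mapping) if label == m]

# Evaluate the costly function once per representative, then fan results out.
cache = {label: expensive(values[index.index(label)]) for label in representatives}
results = [cache[m] for m in mapping]

print(mapping, positions, results)
```

Here expensive runs three times rather than five, and the mapping supplies the lookup needed to fill in the two repeated elements.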
condition
(new to Analytica 5.3)
When you specify «condition», it finds the unique values from only those items that match the «condition». For example, if you don't want to include Null values, you can use:
Unique(A, condition:A<>Null)
or to find only unique text values (when there might be other data types present such as numbers):
Unique(A, condition:IsText(A))
When «mapToUnique» is true, Null is returned for unmatched items along «I». When «condition» has an index not present in «A», the item of «A» is included if «condition» is true anywhere along that extra index.
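The two filters above can be modelled in Python as a predicate applied before de-duplication. This is a sketch only: unique_where is a hypothetical name, None stands in for Null, and the sample data is invented.

```python
def unique_where(values, keep):
    """Unique values among only those items for which keep(value) is true."""
    result = []
    for v in values:
        if keep(v) and v not in result:
            result.append(v)
    return result


data = ["x", None, 3, "x", 3, None, "y"]

# Models Unique(A, condition: A <> Null)
print(unique_where(data, lambda v: v is not None))

# Models Unique(A, condition: IsText(A))
print(unique_where(data, lambda v: isinstance(v, str)))
```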
Notes
The Set Functions such as SetDifference or SetUnion ensure that no duplicates exist in the final result, and hence can also be used to find the unique elements. For example, when L is a 1-D array or list,
#SetDifference(\L)
and
Unique(L)
do the same thing. Since no set is being subtracted, SetDifference returns the set \L after duplicates are removed. In some instances, SetDifference has advantages over Unique. For example, if you also want to ignore certain values, such as Null, you could compute the unique elements and then call SetDifference to remove those values. In that case you can skip the call to Unique entirely, since SetDifference already removes duplicates.
The ordering of elements in the result follows the ordering of the elements in «a», which often feels arbitrary. Hence, it is common to wrap the call to Unique in a call to SortIndex, such as
SortIndex(Unique(a))
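In Python terms, the contrast between first-occurrence order and sorted order looks like this. It is an analogy only, with made-up data; dict.fromkeys is used here simply as an order-preserving way to drop duplicates.

```python
a = ["pear", "apple", "pear", "fig", "apple"]

# Order of first occurrence, analogous to the unsorted result of Unique(a).
in_source_order = list(dict.fromkeys(a))

# Sorted afterwards, analogous to wrapping the call in SortIndex.
in_sorted_order = sorted(in_source_order)

print(in_source_order)
print(in_sorted_order)
```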
History
The «I» parameter was first made optional in Analytica 5.0. Prior to that, the Unique(a) usage was not available. In those earlier releases, where the index cannot be omitted, use a[I = Unique(a, I)] or #SetDifference(\a) instead, for example to define an index.
The «mapToUnique» parameter was introduced in Analytica 5.0.
The «resultIndex» parameter is present in Analytica 4.3 and later, but was hidden (it did not show up in Expression Assist, etc.) until Analytica 5.0. You can still use it in those earlier releases even though it is hidden.
See also
- SortIndex
- Subset
- Set Functions -- These also remove duplicates from their results.