Difference between revisions of "Rank"
Revision as of 00:51, 11 November 2008
Returns an array of the rank values of X across index I.
Declaration
Rank( X : vector[I] ; I : optional Index ; type : optional atomic numeric = -1 )
where type, if specified, may be one of the following values:
- -1 = lower-rank (default)
- 0 = mid-rank
- 1 = upper-rank
- Null = unique rank (new to 4.2)
Description
The lowest value in X has a rank value of 1, the next-lowest has a rank of 2, and so on. I is optional if X is one-dimensional. If I is omitted when X has more than one dimension, the innermost dimension is ranked. Since you, as a modeler, have little control over which dimension is the inner dimension, you should always specify I unless you can guarantee that X will always be one-dimensional.
(new to 4.0) The rank type determines how values in X that occur multiple times are ranked. For example, if the value x=5 occurs 6 times, and there are 3 other values in X that are less than 5, then x=5 appears in the sort order at positions 4, 5, 6, 7, 8, and 9. The lower-rank of x=5 is 4, the upper-rank of x=5 is 9, and the mid-rank is 6.5. Note that a mid-rank is not necessarily a valid index position (since it may be fractional), so if you intend to use the result of Rank in a slice function, you should use either the lower-rank (the default) or the upper-rank.
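The arithmetic in this example can be checked with a short Python sketch. This is only an emulation of the three rank types described above, not Analytica code or Analytica's implementation; the function name `rank` and the sample list are assumptions made for illustration.

```python
def rank(values, rank_type=-1):
    """Emulate lower (-1), mid (0), and upper (1) ranks, counting from 1."""
    result = []
    for v in values:
        smaller = sum(1 for w in values if w < v)  # elements strictly below v
        ties = sum(1 for w in values if w == v)    # elements equal to v, including v itself
        lower = smaller + 1       # first sort-order position occupied by v
        upper = smaller + ties    # last sort-order position occupied by v
        if rank_type == -1:
            result.append(lower)
        elif rank_type == 1:
            result.append(upper)
        else:
            result.append((lower + upper) / 2)  # mid-rank may be fractional
    return result

# Hypothetical data: three values below 5, then the value 5 six times.
x = [1, 2, 3, 5, 5, 5, 5, 5, 5]
lower_ranks = rank(x, -1)
mid_ranks = rank(x, 0)
upper_ranks = rank(x, 1)
print(lower_ranks[3], mid_ranks[3], upper_ranks[3])  # 4 6.5 9
```

The fractional mid-rank of 6.5 is the reason only the lower-rank or upper-rank is suitable as a slice position.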
(new to 4.2) To assign every element a unique rank, specify type:Null. Tied elements then receive distinct ranks, with their positions in the original array used to break the tie.
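A rough Python emulation of unique ranking follows. It assumes that, among tied elements, the one appearing earlier in the original array receives the lower rank; the text above says only that position breaks the tie, so that direction is an assumption, as is the function name `unique_rank`.

```python
def unique_rank(values):
    """Assign ranks 1..n, breaking ties by original position (assumed: earlier first)."""
    # sorted() is stable, so equal values keep their original relative order.
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for position, i in enumerate(order, start=1):
        ranks[i] = position
    return ranks

# Hypothetical data with the value 5 tied three times.
unique_ranks = unique_rank([5, 3, 5, 1, 5])
print(unique_ranks)  # [3, 2, 4, 1, 5]
```

Because every rank is a distinct integer from 1 to n, the result is always a permutation of the positions.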
The RankCorrel function also allows you to select which rank type to use, but it uses mid-rank by default in Analytica 4.0.
Examples
Advanced Variations
Case Sensitivity
When ranking text values, Rank compares them case-sensitively, with capital letters preceding lowercase letters. So, for example, "Zebra" gets a lower rank than "apply". In Analytica 4.2 or later, you can make the comparison case-insensitive by supplying the optional parameter caseInsensitive:true:
Rank(D,I,caseInsensitive:true)
Case sensitivity affects only text values.
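The effect of the two orderings can be illustrated in Python, whose default string comparison also places capital letters before lowercase ones. This is a sketch of the idea only: the helper `lower_rank` and the word list are hypothetical, and Analytica's exact text collation may differ in detail from Python's code-point ordering and `str.casefold`.

```python
def lower_rank(values, key=lambda v: v):
    """Lower rank of each value under the ordering induced by key."""
    keys = [key(v) for v in values]
    return [1 + sum(1 for other in keys if other < k) for k in keys]

words = ["apply", "Zebra", "mango"]
case_sensitive = lower_rank(words)                      # "Zebra" sorts before "apply"
case_insensitive = lower_rank(words, key=str.casefold)  # compare as if all lowercase
print(case_sensitive)    # [2, 1, 3]
print(case_insensitive)  # [1, 3, 2]
```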
Descending rank
To rank a numeric array in descending order, so that the largest numbers receive the lowest ranks, negate the array:
Rank(-X,I)
For arrays containing text values, in Analytica 4.2 and later you can specify the optional descending:true parameter:
Rank(X,I,descending:true)
which uses the reverse rank order for text as well as numbers.
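The Python sketch below emulates both approaches to show that negating numeric values and ranking in reverse order agree, and that only the latter applies to text. The `descending` flag here belongs to the hypothetical helper `lower_rank`, not to Analytica, and the sample data is made up.

```python
def lower_rank(values, descending=False):
    """Lower rank of each value; with descending=True the largest value ranks 1."""
    if descending:
        return [1 + sum(1 for w in values if w > v) for v in values]
    return [1 + sum(1 for w in values if w < v) for v in values]

numbers = [10, 30, 20]
negated = lower_rank([-n for n in numbers])                # analogous to ranking -X
descending_numbers = lower_rank(numbers, descending=True)  # same result without negation
descending_text = lower_rank(["pear", "apple", "fig"], descending=True)
print(negated, descending_numbers, descending_text)  # [3, 1, 2] [3, 1, 2] [1, 3, 2]
```

Negation is not defined for text, which is why a separate descending option is needed there.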
Multi-key ranking
When two values are tied for the same rank, a second array can be used to break the tie. This is referred to as a multi-key sort (or multi-key rank). The first array is the primary key, the next is the secondary key, and so on. Analytica 4.2's Rank function supports multi-key ranking. To use it, your keys must all share a common index, I. You must then introduce a new index, keyIndex, to dimension your keys (any number of keys may be used), and bundle your keys into a 2-D array indexed by I and keyIndex. For example, the following ranks by age, then for ties by gender:
Index keyIndex := ['age','gender'];
Rank( Array(keyIndex,[age,gender]), I, keyIndex )
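As a rough illustration of what multi-key ranking computes, the Python sketch below compares rows key by key, using the secondary key only when the primary key is tied. The function `multi_key_rank` and the sample `age` and `gender` data are hypothetical, and the sketch returns lower ranks only.

```python
def multi_key_rank(*keys):
    """Lower rank of each row, comparing keys from left (primary) to right."""
    rows = list(zip(*keys))  # one tuple per element of the shared index
    # Tuples compare element by element, so later keys matter only on ties.
    return [1 + sum(1 for other in rows if other < row) for row in rows]

# Hypothetical data: two people aged 30 and two aged 25.
age = [30, 25, 30, 25]
gender = ["M", "F", "F", "M"]
ranks = multi_key_rank(age, gender)
print(ranks)  # [4, 1, 3, 2]
```

Ranking by age alone would leave each pair tied; the gender key separates them.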