Difference between revisions of "Rank"

Latest revision as of 05:22, 17 February 2024

Rank(x, i)

Rank(x, i) returns an array of the rank values of «x» across index «i». The lowest value in «x» has a rank value of 1, the next-lowest has a rank of 2, and so on. «i» is optional if «x» is one-dimensional. If «i» is omitted when «x» has more than one dimension, the innermost dimension is ranked. Since you, as a modeler, have little control over which dimension is the inner dimension, you should always specify «i» unless you can guarantee that «x» will always be one-dimensional.

Examples

In the example below, Rank is applied to a list. Because a list is one-dimensional, the second parameter «i» is unnecessary. A one-dimensional array is returned, indexed by Years. This array has a value of 1 where Years has the smallest value, a value of 2 where Years has the second-smallest value, and so on.

Rank(Years) →

Years ▶
2005	2006	2007	2008	2009
1	2	3	4	5

In the example below, the Rank function works with a multidimensional table. Each car type is given a rank for each year, with the cheapest car in that year given 1 and the most expensive car in that year given 3.

Rank(Car_prices, Car_type) →

	Years ▶
Car_Type ▼	2005	2006	2007	2008	2009
VW	1	1	1	1	1
Honda	2	2	2	2	2
BMW	3	3	3	3	3
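
As a rough illustration of ranking along one index of a two-dimensional table, the Python sketch below ranks the car types against each other separately within each year. It is not Analytica code. The prices are hypothetical placeholders, chosen only so that VW is cheapest and BMW most expensive in every year, and the helper name rank_across_rows is invented for this sketch.

```python
# Hypothetical prices per car type, one entry per year 2005..2009.
# These numbers are made up for illustration; the page does not list Car_prices.
car_prices = {
    "VW":    [16000, 16500, 17000, 17500, 18000],
    "Honda": [18000, 18500, 19000, 19500, 20000],
    "BMW":   [25000, 25500, 26000, 26500, 27000],
}


def rank_across_rows(table):
    """Rank the rows against each other separately within every column."""
    names = list(table)
    n_cols = len(table[names[0]])
    ranked = {name: [] for name in names}
    for col in range(n_cols):
        column = [table[name][col] for name in names]
        for name in names:
            # Rank = 1 + number of strictly smaller values in the same column.
            smaller = sum(1 for v in column if v < table[name][col])
            ranked[name].append(1 + smaller)
    return ranked


ranks = rank_across_rows(car_prices)
```

With these placeholder prices, every year yields rank 1 for VW, 2 for Honda, and 3 for BMW, which is the shape of the result table above.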

Optional Parameters

Type

If two (or n) values are equal, they receive the same rank, and the next higher value receives a rank 2 (or n) higher. You can use an optional parameter, «Type», to control which rank is assigned to equal values. By default, the lowest rank is used, equivalent to Rank(x, i, Type: -1). Alternatively, Rank(x, i, Type: 0) uses the mid-rank and Rank(x, i, Type: 1) uses the upper rank. Rank(x, i, Type: Null) assigns a unique rank to every element (the numbers 1 through n), so tied elements receive different ranks.

Example

The example below shows how the Rank function handles duplicate values.

Rank(NumRepairs, CarNum, Type: RankType) →

	CarNum ▶
Rank Type ▼	1	2	3	4	5	6	7
-1	7	2	6	2	2	1	2
0	7	3.5	6	3.5	3.5	1	3.5
1	7	5	6	5	5	1	5
Null	7	2	6	3	4	1	5

For Type = -1, the lowest rank among the duplicate values is returned.

For Type = 0, the mid-rank of the duplicate values is returned.

For Type = 1, the highest rank among the duplicate values is returned.

For Type = Null, each duplicate receives its own unique rank.
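
The four tie-handling rules can be sketched in Python as follows. This is an illustration of the semantics, not Analytica's implementation. It assumes that NumRepairs holds the same values as the Repair row of the NumMaintEvents table (the page does not list NumRepairs itself), and that the Null type breaks ties by position, which is also an assumption. The function name rank and its tie_type argument are invented for this sketch, with None standing in for Null.

```python
def rank(values, tie_type=-1):
    """Rank values in ascending order; tie_type mimics the Type parameter."""
    n = len(values)
    # Stable sort of positions by value, so ties keep their original order.
    order = sorted(range(n), key=lambda k: values[k])
    ranks = [0] * n
    pos = 0
    while pos < n:
        # Extend 'end' over the run of values equal to the one at 'pos'.
        end = pos
        while end + 1 < n and values[order[end + 1]] == values[order[pos]]:
            end += 1
        lowest, highest = pos + 1, end + 1  # 1-based ranks covered by the run
        for offset, k in enumerate(order[pos:end + 1]):
            if tie_type is None:
                ranks[k] = lowest + offset         # unique rank per element
            elif tie_type == -1:
                ranks[k] = lowest                  # lowest rank of the run
            elif tie_type == 0:
                ranks[k] = (lowest + highest) / 2  # mid-rank of the run
            elif tie_type == 1:
                ranks[k] = highest                 # highest rank of the run
            else:
                raise ValueError("tie_type must be -1, 0, 1 or None")
        pos = end + 1
    return ranks


# Assumed contents of NumRepairs, indexed by CarNum 1..7.
num_repairs = [10, 4, 9, 4, 4, 1, 4]
by_type = {t: rank(num_repairs, t) for t in (-1, 0, 1, None)}
```

For types -1, 0 and 1 this gives the same ranks as the corresponding table rows. For None, the four cars tied at 4 repairs receive ranks 2, 3, 4 and 5 in order of car number, so the full result is 7, 2, 6, 3, 4, 1, 5.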

Case Sensitivity

When ranking text values, Rank treats text values as being case-sensitive with capital letters preceding lower case letters. So, for example, Zebra gets a lower rank than apply. In Analytica 4.2 or later, you can supply the optional parameter caseInsensitive: true:

Rank(x, i, caseInsensitive: true)

Case sensitivity only impacts text values.
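
The effect of case sensitivity on text ranks can be illustrated with a small Python sketch. It relies on Python's default string comparison, which orders by code point so that uppercase ASCII letters precede lowercase ones. Whether Analytica's collation matches this in every detail is an assumption. The word "mango" and the helper name rank_text are invented for the illustration.

```python
def rank_text(values, case_insensitive=False):
    """Rank text values in ascending order, optionally ignoring case."""
    def key(s):
        return s.casefold() if case_insensitive else s

    sorted_keys = sorted(key(v) for v in values)
    # index() finds the first occurrence, i.e. the lowest rank for ties.
    return [sorted_keys.index(key(v)) + 1 for v in values]


words = ["apply", "Zebra", "mango"]
sensitive = rank_text(words)                            # [2, 1, 3]
insensitive = rank_text(words, case_insensitive=True)   # [1, 3, 2]
```

In the case-sensitive ranking "Zebra" comes first because of its capital letter. With case ignored it moves to last place.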

Descending rank

To reverse the ranking of a numeric array, so that the largest numbers receive the lowest ranks, use Rank(-x, i).

For arrays containing text values, in Analytica 4.2 and later you can specify the optional descending: true parameter

Rank(x, i, descending: true)

which uses the reverse rank order for text as well as numbers.
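
The Python sketch below illustrates why negation works for numbers but a separate descending option is needed for text. It is not Analytica code: the function names are invented, the text values are hypothetical, and ties are given the lowest rank, as in the default Type.

```python
def rank_ascending(values):
    """Rank = 1 + number of strictly smaller values."""
    return [1 + sum(1 for other in values if other < v) for v in values]


def rank_descending(values):
    """Rank = 1 + number of strictly larger values (works for text too)."""
    return [1 + sum(1 for other in values if other > v) for v in values]


numbers = [10, 4, 9, 4, 4, 1, 4]
# For numbers, ranking the negated values equals a descending rank.
via_negation = rank_ascending([-v for v in numbers])
numeric_desc = rank_descending(numbers)

# Text cannot be negated, so a direct descending comparison is needed.
names = ["pear", "apple", "fig"]  # hypothetical values
text_desc = rank_descending(names)
```

Here via_negation and numeric_desc are identical, while text_desc can only be obtained with the descending comparison.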

Multi-key ranking

When two values are tied for the same rank, a second array can be used to break the tie. This is referred to as a multi-key sort (or multi-key rank). The first array is the primary key, the next is the secondary key, and so on. Analytica 4.2's Rank function supports multi-key ranking. To use it, your keys must all share a common index, «i». You must then introduce a new index, «keyIndex», to dimension your keys (any number of keys may be used), and bundle your keys into a 2-D array indexed by «i» and «keyIndex». For example, the following ranks by age, then, in case of a tie, by gender:

Index keyIndex := ['age', 'gender'];

Rank(Array(keyIndex, [age, gender]), i, keyIndex:keyIndex)

Example

This example shows how to compute a multi-key rank by passing the optional «keyIndex» parameter to the Rank function. It returns the rank of the cars by number of maintenance events, using the index maintType as the «keyIndex»: repairs are the primary key, scheduled maintenance the secondary key, and tires the third key.

Table NumMaintEvents:

	CarNum ▶
MaintType ▼	1	2	3	4	5	6	7
Repair	10	4	9	4	4	1	4
Scheduled	0	2	0	1	2	0	5
Tires	0	2	0	0	1	0	0

Rank(NumMaintEvents, CarNum, keyIndex: maintType) →

	CarNum ▶
▼	1	2	3	4	5	6	7
	7	4	6	2	3	1	5
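
The multi-key result above can be checked with a short Python sketch. It is an illustration only, not Analytica's implementation, and the helper name multi_key_rank is invented. Each car is turned into a tuple of keys, and Python compares tuples lexicographically: the first element is compared first, and later elements are used only to break ties.

```python
# Rows of the NumMaintEvents table, indexed by CarNum 1..7.
repair    = [10, 4, 9, 4, 4, 1, 4]
scheduled = [0, 2, 0, 1, 2, 0, 5]
tires     = [0, 2, 0, 0, 1, 0, 0]


def multi_key_rank(*keys):
    """Rank elements by the first key, breaking ties with later keys."""
    rows = list(zip(*keys))  # one (primary, secondary, ...) tuple per element
    # Rank = 1 + number of tuples that sort strictly before this one.
    return [1 + sum(1 for other in rows if other < row) for row in rows]


ranks = multi_key_rank(repair, scheduled, tires)  # [7, 4, 6, 2, 3, 1, 5]
```

Cars 2, 4, 5 and 7 all have 4 repairs. Car 4 ranks first among them because it has the fewest scheduled events, and cars 2 and 5, which tie on scheduled events as well, are separated by the tires key.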

Treatment of NaN and Null values

When a NaN value occurs in your data, Rank can either pass it through as a NaN or assign it a numeric rank. The optional «passNaNs» parameter controls this behavior. The special value NaN indicates an indeterminate real number, which in theory cannot be compared to other numbers, so the ordering of NaN by Rank doesn't actually make logical sense. By passing NaN values through without assigning an actual rank, you may catch errors in your model that lead to the introduction of the NaN value in the first place, since these errors continue to be propagated to your results. This is often a desirable property.

By default, Rank assigns an arbitrary ranking to NaN values, placing them between -INF and any finite numeric value. To pass NaNs through, include the optional parameter passNaNs: true:

Rank(x, i, passNaNs: true)

Null values are sometimes used for missing data, and in these cases they can likewise be passed through using the passNulls: true parameter. When this is not specified, Null values are assigned a rank (with Null coming after all numeric values).
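
The ordering described above can be sketched in Python, with None standing in for Null. This is a hedged illustration rather than Analytica's implementation: the function name and its arguments are invented, and the sketch assumes that a passed-through NaN or Null still occupies its slot in the ordering, so that the ranks of the other values do not change. The page does not state how Analytica handles that detail.

```python
import math


def rank_with_missing(values, pass_nans=False, pass_nulls=False):
    """Rank numbers, placing NaN just after -inf and None (Null) last."""
    def sort_key(v):
        # Groups: 0 = -inf, 1 = NaN, 2 = all other numbers, 3 = None.
        if v is None:
            return (3, 0.0)
        if isinstance(v, float) and math.isnan(v):
            return (1, 0.0)
        if v == -math.inf:
            return (0, 0.0)
        return (2, v)

    keys = [sort_key(v) for v in values]
    result = []
    for v, k in zip(values, keys):
        if v is None and pass_nulls:
            result.append(None)       # pass Null through unranked
        elif k[0] == 1 and pass_nans:
            result.append(math.nan)   # pass NaN through unranked
        else:
            # Assumption: passed values still count toward other ranks.
            result.append(1 + sum(1 for other in keys if other < k))
    return result


data = [3.0, math.nan, -math.inf, None, 1.0]
default_ranks = rank_with_missing(data)  # [4, 2, 1, 5, 3]
passed = rank_with_missing(data, pass_nans=True, pass_nulls=True)
```

In passed, the NaN and None entries come back unchanged, so an unexpected NaN in the input remains visible in the result instead of being hidden behind an arbitrary rank.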

@@ Line 6: / Line 6: @@
 ==Rank(x, i)==
-Rank(x,i) returns an array of the rank values of «x»''' '''across index «i». The lowest value in «x» has a rank value of 1, the next-lowest has a rank of 2, and so on.  «i» is optional if «x» is one-dimensional.   If «i» is omitted when «x» has more than one dimension, the innermost dimension is ranked.  Since you, as a modeler, have little control over which dimension is the inner dimension, you should always specify «i» unless you can guarantee that «x» will always be one-dimensional.
+[[Rank]](x, i) returns an array of the rank values of «x» across index «i». The lowest value in «x» has a rank value of 1, the next-lowest has a rank of 2, and so on.  «i» is optional if «x» is one-dimensional. If «i» is omitted when «x» has more than one dimension, the innermost dimension is ranked.  Since you, as a modeler, have little control over which dimension is the inner dimension, you should always specify «i» unless you can guarantee that «x» will always be one-dimensional.
 ==Examples ==
-In the example below, the rank of a list is evaluated. Thus, the second parameter "I" is unnecessary. A one-dimensional array is returned, indexed by "Years". This array has a value of 1 where "Year" has the smallest value, a value of 2 where "Year" has the second smallest value, and so on.
+In the example below, the rank of a list is evaluated. Thus, the second parameter «i» is unnecessary. A one-dimensional array is returned, indexed by <code>Years</code>. This array has a value of 1 where <code>Year</code> has the smallest value, a value of 2 where <code>Year</code> has the second smallest value, and so on.
+:<code>Rank(Years) &rarr;</code>
-Rank(Years) →
+:{| class="wikitable"
-{| border="1"
+! colspan="5" | Years &#9654;
-! !! colspan="5" style="text-align: left;" | Years &#9654;
 |-
-! style="width:75px;" |''' '''
 ! style="width:75px;" |'''2005            '''
 ! style="width:75px;" |'''2006            '''
@@ Line 22: / Line 20: @@
 ! style="width:75px;" |'''2009            '''
 |-
-|
 | 1
 | 2
@@ Line 30: / Line 27: @@
 |}
-In the example below, the Rank function works with a multidimensional table. Each car type is given a rank for each year, with the cheapest car in that year given "1" and the most expensive car in that year given "3"
+In the example below, the [[Rank]] function works with a multidimensional table. Each car type is given a rank for each year, with the cheapest car in that year given 1 and the most expensive car in that year given 3.
+:<code>Rank(Car_prices, Car_type) &rarr;</code>
-Rank(Car_prices, Car_type) →
+:{| class="wikitable"
-{| border="1"
+! !!  colspan="5" | Years &#9654;
-! !! style="text-align: left;" colspan="5" | Years &#9654;
 |-
-! style="width:75px;" |''' '''
+! Car_Type &#9660;
-! style="width:75px;" |'''2005            '''
+! 2005
-! style="width:75px;" |'''2006            '''
+! 2006
-! style="width:75px;" |'''2007            '''
+! 2007
-! style="width:75px;" |'''2008            '''
+! 2008
-! style="width:75px;" |'''2009            '''
+! 2009
 |-
 |'''VW'''
@@ Line 68: / Line 64: @@
 === Type ===
-If two (or N) values are equal, they receive the same rank and the next higher value receives a rank 2 (or N) higher. You
+If two (or ''n'') values are equal, they receive the same rank and the next higher value receives a rank 2 (or ''n'') higher. You can use an optional parameter, «Type», to control which rank is assigned to equal values. By default, the lowest rank is used, equivalent to <code>Rank(x, i, Type: -1)</code> Alternatively, <code>Rank(x, i, Type: 0)</code> uses the mid-rank and <code>Rank(x, i, Type:1)</code> uses the upper-rank. <code>Rank(x, i, Type: Null)</code> assigns a unique rank to every element (the numbers 1 thru ''n'') in which tied elements may have different ranks.
-can use an optional parameter, '''Type''', to control which rank is assigned
-to equal values. By default, the lowest rank is used, equivalent to '''Rank(x, i, Type:-1)'''. Alternatively, '''Rank(x, i, Type:0) '''uses the mid-rank and '''Rank(x, i, Type:1) '''uses the upper-rank. '''Rank(x, i, Type:Null) '''assigns a unique rank to every element (the numbers 1 thru N) in which tied elements may have different ranks.
-{| border="1"
+==== Example ====
-! !! style="text-align: left;" colspan="7" | Rank Type '''&#9660;''', CarNum &#9654;
+The example below shows how the [[Rank]] function handles duplicate values.
+:<code>Rank(NumRepairs, CarNum, Type: RankType) &rarr;</code>
+:{| class="wikitable"
+! !! colspan="7" |  CarNum &#9654;
 |-
-! style="width:75px;" |''' '''
+! Rank Type &#9660;
-! style="width:75px;" |'''1'''
+! 1
-! style="width:75px;" |'''2'''
+! 2
-! style="width:75px;" |'''3'''
+! 3'''
-! style="width:75px;" |'''4'''
+! 4
-! style="width:75px;" |'''5'''
+! 5
-!6
+! 6
-!7
+! 7
 |-
 |'''-1'''
@@ Line 121: / Line 118: @@
 |}
+:For <code>Type = -1</code>, lowest rank for the duplicate value is returned.
+:For <code>Type = 0</code>, mid rank value of the duplicates is returned.
+:For <code>Type = 1</code>, highest rank for the duplicate value is returned.
+:For <code>Type = Null</code>, unique rank value is returned.
 ===Case Sensitivity===
-When ranking text values, [[Rank]] treats text values as being case-sensitive with capital letters preceding lower case letters.  So, for example, "Zebra" gets a lower rank than "apply".  In Analytica 4.2 or later, you can supply the optional parameter ''caseInsensitive:true'':
+When ranking text values, [[Rank]] treats text values as being case-sensitive with capital letters preceding lower case letters.  So, for example, <code>Zebra</code>  gets a lower rank than <code>apply</code>.  In Analytica 4.2 or later, you can supply the optional parameter <code>caseInsensitive: true</code>:
+:<code>Rank(x, i, caseInsensitive: true)</code>
-  [[Rank]](D,I, caseInsensitive:true)
 Case sensitivity only impacts text values.
@@ Line 132: / Line 132: @@
 === Descending rank ===
-You can easily reverse the rank of numeric arrays, where the largest numbers receive the lowest ranks, by simply using:
+You can easily reverse the rank of numeric arrays, where the largest numbers receive the lowest ranks, by simply using <code>Rank(-X, i)</code>.
-  [[Rank]](-X, i)
+For arrays containing text values, in Analytica 4.2 and later you can specify the optional <code>descending: true</code> parameter
+:<code>Rank(x, i, descending: true)</code>
-For arrays containing text values, in Analytica 4.2 and later you can specify the optional ''descending:true'' parameter:
-  [[Rank]](X, i, descending:true)
 which uses the reverse rank order for text as well as numbers.
 === Multi-key ranking ===
-When two values are tied for the same rank, a second array can be used to break the tie.  This is referred to as a multi-key sort (or multi-key rank). The first array is the primary key, the next is the secondary key, and so on.  Analytica 4.2's [[Rank]] function support multi-key ranking.  To use, your keys must all share a common index, ''I''.  You must then introduce a new index, ''keyIndex'', to dimension your keys (any number of keys may be used), and bundle your keys into a 2-D array indexed by ''I'' and ''keyIndex''.  For example, the following ranks by age, then for ties by gender:
+When two values are tied for the same rank, a second array can be used to break the tie.  This is referred to as a multi-key sort (or multi-key rank). The first array is the primary key, the next is the secondary key, and so on.  Analytica 4.2's [[Rank]] function support multi-key ranking.  To use, your keys must all share a common index, «i».  You must then introduce a new index, «keyIndex», to dimension your keys (any number of keys may be used), and bundle your keys into a 2-D array indexed by «i» and «keyIndex».  For example, the following ranks by <code>age</code>, then, in case of a tie, by <code>gender</code>:
- [[Index..Do|<code>Index</code>]]<code> keyIndex := ['age','gender'];
- [[Rank]](Array(keyIndex,[age,gender]), i, keyIndex )</code>
-=== Treatment of NaN and Null values ===
+:<code>Index keyIndex := ['age', 'gender'];</code>
+:<code>Rank(Array(keyIndex, [age, gender]), i, keyIndex:keyIndex)</code>
-When a [[NaN]] value occurs in your data, [[Rank]] can either pass it through as a [[NaN]] or assign it a numeric rank.  The optional ''passNaNs'' parameter controls this behavior.  The special value [[NaN]] indicates an indeterminate real number, which in theory cannot be compared to other numbers, so the ordering of [[NaN]] by [[Rank]] doesn't actually make logical sense.  By passing NaN values through without assigning an actual rank, you may catch errors in your model that lead to the introduction of the [[NaN]] value in the first place, since these errors continue to be propagated to your results.  This is often a desirable property.
+====Example====
+This example shows how multi-key rank can be calculated by passing the optional «keyIndex» parameter to the rank function. The rank of the cars by the number of maintenance events using index <code>maintType</code> as the «keyIndex» is returned.
-By default, [[Rank]] assigns an arbitrary ranking to [[NaN]] values -- placing them between -[[INF]] and any finite numeric value.  To pass NaNs, include the optional parameter: [[Rank]](x, i, passNaNs:true)
+Table NumMaintEvents:
-[[Null]] values are sometimes used for missing data, and in these cases can also be passed using the passNulls:true parameter:
+:{| class="wikitable"
+! !! colspan="7" | CarNum &#9654;
-  [[Rank]](x, i, passNulls:true).
+|-
-When this is not specified, Null values are assigned a rank (with [[Null]] coming after all numeric values).
+! MaintType &#9660;
-===Optional Type Parameter Examples===
-This example shows how the Rank function handles duplicate values.
-For Type = -1, lowest rank for the duplicate value is returned.
-For Type = 0, mid rank value of the duplicates is returned.
-For Type = 1, highest rank for the duplicate value is returned.
-For Type = NULL, Unique rank value is returned.
-Index RankType := [-1,0,1, Null]
-Rank(NumRepairs,CarNum,Type:RankType →
-{| class="wikitable"
-!
 ! 1
 ! 2
@@ Line 180: / Line 163: @@
 ! 7
 |-
-! -1
+!'''Repair'''
-| 7 || 2 || 6 || 2 || 2 || 1 || 2
+| 10
+| 4
+| 9
+| 4
+| 4
+| 1
+| 4
 |-
-! 0
+!'''Scheduled'''
-| 7 || 3.5 || 6 || 3.5 || 3.5 || 1 || 3.5
+| 0
+| 2
+| 0
+| 1
+| 2
+| 0
+| 5
 |-
-! 1
+!'''Tires'''
-| 7 || 5 || 6 || 5 || 5 || 1 || 5
+| 0
-|-
+| 2
-! Null
+| 0
-| 7 || 2 || 6 || 3|| 4 || 1 || 5
+| 0
+| 1
+| 0
+| 0
 |}
-===Multi-Key Example===
+:<code>Rank(NumMaintEvents, CarNum, keyIndex: maintType ) &rarr;</code>
-Rank(NumMaintEvents,CarNum,KeyIndex:MaintType) →
-{| class="wikitable"
+:{| class="wikitable"
-!
+! !! colspan="7" | CarNum &#9654;
+|-
+! &#9660;
 ! 1
 ! 2
@@ Line 206: / Line 205: @@
 ! 7
 |-
 !
-| 7 || 4 || 6 || 2 || 3 || 1 || 5
+| 7
+| 4
+| 6
+| 2
+| 3
+| 1
+| 5
+|-
 |}
+=== Treatment of NaN and Null values ===
+When a [[NaN]] value occurs in your data, [[Rank]] can either pass it through as a [[NaN]] or assign it a numeric rank.  The optional «passNaNs» parameter controls this behavior.  The special value [[NaN]]  indicates an indeterminate real number, which in theory cannot be compared to other numbers, so the ordering of [[NaN]] by [[Rank]] doesn't actually make logical sense.  By passing [[NaN]] values through without assigning an actual rank, you may catch errors in your model that lead to the introduction of the [[NaN]] value in the first place, since these errors continue to be propagated to your results. This is often a desirable property.
+By default, [[Rank]] assigns an arbitrary ranking to [[NaN]] values -- placing them between -[[INF]] and any finite numeric value. To pass [[NaN]]s, include the optional parameter <code>passNaNs: true</code>
+:<code>Rank(x, i, passNaNs: true).</code>
+[[Null]] values are sometimes used for missing data, and in these cases can also be passed using the <code>passNulls: true</code> parameter. When this is not specified, null values are assigned a rank (with [[Null]] coming after all numeric values).
 ==See Also ==
 * [[Sort]]
 * [[SortIndex]]
 * [[RankCorrel]]
-* [[Examples variables]]
+* [[Example variables]]