Wilcoxon Distribution


Wilcoxon(m, n, exact)

ProbWilcoxon(u, m, n, exact)

CumWilcoxon(u, m, n, exact)

CumWilcoxonInv(p, m, n, exact)

The Wilcoxon distribution is a discrete, bell-shaped, non-negative distribution that describes the distribution of the U-statistic in the Mann-Whitney-Wilcoxon rank-sum test when comparing two unpaired samples drawn from the same arbitrary distribution. The rank-sum test is perhaps the most commonly used non-parametric significance test in statistics for detecting when one distribution is stochastically greater than (or not equal to) another, without assuming that the underlying distributions are normally distributed.

The Wilcoxon distribution function in Analytica returns a random sample from the Wilcoxon distribution (or the Mid-value when evaluated in Mid-mode). When performing a rank-sum statistical test, you can use the related function CumWilcoxon to compute the p-Value, or CumWilcoxonInv to compute the rejection threshold for a given significance level. ProbWilcoxon gives the probability of a given value analytically (i.e., without using a Monte Carlo sample). Random(Wilcoxon(m, n)) can be used to generate single random variates.

The distribution is parameterized by two non-negative integers: «m» and «n». In a rank-sum test, these correspond to the sample sizes of the data measured from each of the two populations.

Library

Distributions Library (all Wilcoxon functions are built-in functions)

The U-Statistic

Suppose you are given «m» observations from one population and «n» observations from a second population. It is assumed that the observations are ordinal (i.e., have a natural ordering, or less-than relationship). Because they are ordered, you can determine the rank of every observation among all m + n observations. The smallest observation is assigned a rank of 1, and the largest a rank of m + n.

For example, suppose your observations consist of numeric measurements, and you have observed the following measurements:

From Population 1:

[12.3, 2.3, 8.3]

From Population 2:

[2.4, 18.1, 1.3, 5.5]

The ranks would be:

Population 1 ranks:

[6, 2, 5]

Population 2 ranks:

[3, 7, 1, 4]

The U-statistic is based entirely on the ranks, rather than on the actual observed values. This eliminates any dependence on a specific distribution type. Let R₁ be the sum of the ranks in Population 1. The U-statistic is defined as:

[math]\displaystyle{ U = R_1 - {{m(m+1)}\over 2} }[/math]

In the example, [math]\displaystyle{ R_1=13 }[/math] and [math]\displaystyle{ U=7 }[/math].
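Spelled out for the example above, with m = 3 and the Population 1 ranks 6, 2 and 5, the arithmetic is:

```latex
R_1 = 6 + 2 + 5 = 13, \qquad U = R_1 - \frac{m(m+1)}{2} = 13 - \frac{3 \cdot 4}{2} = 7
```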

An equivalent method of obtaining U is to count, for each observation in Population 1, the number of observations in Population 2 that have a smaller rank (counting 0.5 for each tie). The sum of these counts is U. This second method is more laborious to carry out, but it makes it easier to interpret what U represents. If the distributions are the same, then the average count would be n/2, and hence the expected value of U would be m*n/2. When U differs substantially from this, it is evidence that the two distributions are not equal.
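As a cross-check outside Analytica, the following Python sketch computes U both ways for the example data and confirms that the two methods agree. The function names (mid_ranks, u_from_ranks, u_from_counts) are hypothetical helpers for illustration, and the sketch assumes that tied values receive the average of the ranks they span.

```python
def mid_ranks(values):
    """Rank values from 1..N, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg = (pos + end) / 2 + 1  # average of the 1-based ranks pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = avg
        pos = end + 1
    return ranks


def u_from_ranks(sample1, sample2):
    """U = R1 - m(m+1)/2, where R1 is the rank sum of sample1."""
    m = len(sample1)
    ranks = mid_ranks(list(sample1) + list(sample2))
    r1 = sum(ranks[:m])
    return r1 - m * (m + 1) / 2


def u_from_counts(sample1, sample2):
    """U as the number of (x, y) pairs with y < x, counting 0.5 per tie."""
    return sum((y < x) + 0.5 * (y == x) for x in sample1 for y in sample2)


pop1 = [12.3, 2.3, 8.3]
pop2 = [2.4, 18.1, 1.3, 5.5]
print(u_from_ranks(pop1, pop2), u_from_counts(pop1, pop2))  # prints 7.0 7.0
```

The counting version is quadratic in the sample sizes, which is why the rank-sum formula is usually preferred in practice.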

The Analytica expressions that can be used to compute U from sample D1 indexed by I1 and sample D2 indexed by I2 are as follows:

Variable Sample1Ranks :=

Index I := Concat(@I1, @I2);

Var allRanks := Rank(Concat(D1, D2, I1, I2, I), I, type: 0);

allRanks[I = @I1]

Variable U :=

Var R1 := Sum(Sample1Ranks, I1);

R1 - m*(m + 1)/2

Mann-Whitney-Wilcoxon Rank-Sum Test

Suppose you have the hypothesis that the distribution of a measurable quantity in Population 1 is stochastically less than the distribution of the same quantity in Population 2. To test this, you carry out an experiment, taking «m» measurements from Population 1 and «n» measurements from Population 2. Suppose the U-statistic for these measurements turns out to be a bit less than m*n/2. Does this mean that your hypothesis is correct?

To determine whether this experimental evidence provides statistically significant confirmation for your hypothesis, compute CumWilcoxon(u, m, n).

The result is the probability that you would see a U-value as small as or smaller than the one observed if the populations were not different. This probability is known as the p-Value. Typically, when the p-Value is less than 5%, one says that there is statistically significant support for the hypothesis.

You can also compute the U-threshold using CumWilcoxonInv(1 - p, m, n), where «p» is the confidence level (e.g., 1 - p = 5% when you want a 95% confidence level). When your measured U-statistic is less than or equal to this value, you would conclude that the hypothesis is supported at a statistically significant level.
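To make the p-Value and threshold concrete, here is a minimal Python sketch of the exact null distribution of U for small, untied samples. It is not Analytica's implementation: the names count_u, prob_u, cum_u and lower_critical_u are hypothetical, and the threshold convention used here (the largest u whose cumulative probability does not exceed the significance level) is an assumption that may differ from the convention CumWilcoxonInv uses.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def count_u(u, m, n):
    """Number of orderings of m + n untied observations that give U == u."""
    if u < 0:
        return 0
    if m == 0 or n == 0:
        return 1 if u == 0 else 0
    # The largest observation is either in sample 1 (it exceeds all n values
    # of sample 2, adding n to U) or in sample 2 (it adds nothing to U).
    return count_u(u - n, m - 1, n) + count_u(u, m, n - 1)


def prob_u(u, m, n):
    """P(U == u) when both samples come from the same distribution."""
    return count_u(u, m, n) / comb(m + n, m)


def cum_u(u, m, n):
    """P(U <= u): the lower-tail p-Value for an observed integer u."""
    return sum(count_u(k, m, n) for k in range(u + 1)) / comb(m + n, m)


def lower_critical_u(alpha, m, n):
    """Largest u with P(U <= u) <= alpha, or None if no such u exists.

    Assumed convention for illustration only.
    """
    crit = None
    for u in range(m * n + 1):
        if cum_u(u, m, n) <= alpha:
            crit = u
        else:
            break
    return crit


# Example from this page: m = 3, n = 4, observed U = 7.
print(round(cum_u(7, 3, 4), 4))        # 0.6857 (= 24/35), far above 5%
print(lower_critical_u(0.05, 3, 4))    # 0: only U = 0 is rare enough
```

In the example, U = 7 lies above m*n/2 = 6, so the data give no support for the hypothesis that Population 1 is stochastically less than Population 2. The recursion is only practical for small «m» and «n», and it does not account for ties.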

In statistical parlance, the MWW rank-sum test is a non-parametric test for comparing two unpaired samples.

Relationship to Parametric Tests

The Student's t-Test is the best-known parametric test for determining whether one distribution is stochastically less than (or not equal to) a second distribution, when both distributions are known to be normally distributed, or at least approximately so. Thus, the key distinction between the rank-sum test and the t-test is the distributional assumption. The rank-sum test is said to be non-parametric since it does not make an assumption about the underlying distributions.

Because of the additional assumption, the t-test is usually more powerful, meaning that statistical significance can often be detected with fewer measurements. The rank-sum test still does well in this regard and is not dramatically less powerful than the t-test. It is also far more robust, which can make it preferable when outliers are present or your distributions aren't really Normal.

Computation Time and Memory

The Wilcoxon distribution can require large amounts of time and memory to compute, especially when «m» and «n» get large. This is true of all the functions: Wilcoxon, ProbWilcoxon, CumWilcoxon and CumWilcoxonInv. At the same time, as «m» and «n» get large, the distribution approaches a Normal distribution. Hence, the functions automatically switch over to a Normal approximation when the sum of «m» and «n» exceeds 100. At that point, the error of ProbWilcoxon or CumWilcoxon tends to be 0.1% or less (this is from observation, not a proven bound). You can explicitly control whether the exact or approximate computation is used by specifying the boolean «exact» parameter. When it is true, the exact algorithm is used (which can easily exhaust memory or take an exorbitant amount of time for very large values). You can switch over to the approximation sooner to save on time and memory by specifying an expression for the «exact» parameter, such as:

ProbWilcoxon(u, m, n, exact: m + n <= 50)
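For reference, the textbook Normal approximation to the null distribution of U (without ties) uses mean m*n/2 and variance m*n*(m + n + 1)/12. The Python sketch below illustrates it; the function name cum_u_normal is hypothetical, and the continuity correction is an assumption, since this page does not state whether Analytica applies one.

```python
from math import erf, sqrt


def cum_u_normal(u, m, n, continuity=True):
    """Normal approximation to P(U <= u) under the null hypothesis (no ties)."""
    mean = m * n / 2
    sd = sqrt(m * n * (m + n + 1) / 12)
    # Continuity correction: treat the integer u as the interval up to u + 0.5.
    x = u + 0.5 if continuity else u
    z = (x - mean) / sd
    return 0.5 * (1 + erf(z / sqrt(2)))  # standard Normal CDF at z


# Small case for comparison: the exact value of P(U <= 6) for m = 3, n = 4 is 20/35.
print(cum_u_normal(6, 3, 4), 20 / 35)
```

Even for samples this small the approximation lands within about 0.01 of the exact value near the center of the distribution; agreement in the tails is typically worse until «m» and «n» grow.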

History

Introduced in Analytica 4.5.