Difference between revisions of "Variance"
(7 intermediate revisions by 2 users not shown) | |||
Line 1: | Line 1: | ||
− | |||
[[category:Statistical Functions]] | [[category:Statistical Functions]] | ||
+ | [[Category:Doc Status C]] <!-- For Lumina use, do not change --> | ||
+ | |||
+ | ==Variance(X, ''I, w'')== | ||
+ | Computes the variance of an uncertain quantity «X», or the «w»-weighted sample variance of a data set. | ||
+ | |||
+ | With multivariate samples (where each data point is a vector), the [[Variance]] function can also be used to compute the covariance (or weighted covariance) matrix from a data set. | ||
+ | |||
+ | If «X» is an uncertain quantity, dependent on Analytica distribution functions, the variance is obtained using [[Variance]](X). | ||
+ | |||
+ | «X» is evaluated in [[sample]] mode, and the variance along the [[Run]] index is computed. | ||
+ | |||
+ | ==Optional parameters== | ||
+ | === I === | ||
+ | Given a data set indexed by «I», the sample variance along index «I» is computed using [[Variance]](X, I). | ||
+ | |||
+ | When the running index «I» is the system index [[Run]] (or is not specified), «X» is evaluated in [[Sample]] mode and the variance among the numeric values is computed. If the running index is anything other than [[Run]], then «X» is evaluated in context. | ||
+ | |||
+ | === W === | ||
+ | The weighted variance is computed by assigning a different "weight" to each point. The weight vector <code>wt</code> should be indexed by «I» (or by [[Run]] if «I» is not specified), and the weighted variance is computed using one of these forms: | ||
+ | :<code>Variance(X, w: wt)</code> | ||
+ | :<code>Variance(X, I, w:wt)</code> | ||
+ | |||
+ | When the «w» parameter is not specified, and the running index «I» is either the [[Run]] index or is not specified, then the weighting defaults to the value in the system variable [[SampleWeighting]]. | ||
+ | |||
+ | The weighted variance is defined as | ||
+ | :<math> | ||
+ | {\sum_i w_i (x_i-\bar{x})^2} \over { \sum_i w_i (1-w_i) } | ||
+ | </math> | ||
+ | |||
+ | where the sum is taken over numeric values, <math>\bar{x}</math> is the weighted mean, and where <math>\sum_i w_i = 1</math>. If ''Sum(w, i) <> 1'', the «w»'s are normalized, so that the sum over numeric values is 1. This is an unbiased estimator of the weighted variance. | ||
+ | |||
+ | When ''w<sub>i</sub>'' is constant, this simplifies to | ||
+ | |||
+ | :<math>{\sum_i (x_i-\bar{x})^2} \over {N-1}</math> | ||
+ | |||
+ | where ''N'' is the number of points ([[sampleSize]] when «I» is [[Run]]). | ||
+ | |||
+ | When one or more points with non-zero weight in x are [[INF]] or -[[INF]], Variance will return [[INF]] if ''[[Min]](x) < [[Max]](x)'', or [[INF]] if ''[[Min]](x) = +[[INF]]'' or ''[[Max]](x) = -[[INF]]''. If there are fewer than two numeric points with positive weight, Variance returns [[NaN]]. Any point with zero weight is ignored, so that [[INF]] or [[NaN]] values don't cause the result to become [[NaN]] if they are given a zero weight. | ||
+ | |||
+ | == See Also == | ||
+ | * [[Statistical Functions and Importance Weighting]] | ||
+ | * [[SDeviation]] | ||
+ | * [[Skewness]] | ||
+ | * [[Kurtosis]] | ||
+ | * [[Correlation]] | ||
+ | * [[Covariance]] |
Latest revision as of 21:51, 18 January 2016
Variance(X, I, w)
Computes the variance of an uncertain quantity «X», or the «w»-weighted sample variance of a data set.
With multivariate samples (where each data point is a vector), the Variance function can also be used to compute the covariance (or weighted covariance) matrix from a data set.
If «X» is an uncertain quantity, dependent on Analytica distribution functions, the variance is obtained using Variance(X).
«X» is evaluated in sample mode, and the variance along the Run index is computed.
Optional parameters
I
Given a data set indexed by «I», the sample variance along index «I» is computed using Variance(X, I).
When the running index «I» is the system index Run (or is not specified), «X» is evaluated in Sample mode and the variance among the numeric values is computed. If the running index is anything other than Run, then «X» is evaluated in context.
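As a rough illustration of taking the sample variance along one index, the following Python sketch treats a small table as if it were indexed by «I» (positions within each list) and by a second, hypothetical index J (the dictionary keys). The data, the name J, and the use of Python's `statistics.variance` are assumptions made for illustration only; this is not how Analytica evaluates the function internally.

```python
from statistics import variance

# Hypothetical data: each key is an element of a second index J,
# and each list holds the values along the running index I.
table = {
    "A": [1.0, 2.0, 3.0, 4.0],
    "B": [10.0, 10.0, 10.0, 10.0],
}

# Analogue of Variance(X, I): reduce over I, keep one result per element of J.
# statistics.variance uses the N-1 denominator (sample variance).
var_along_I = {j: variance(values) for j, values in table.items()}

print(var_along_I)
```

The result has one entry per element of J, since only the running index is reduced.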
W
The weighted variance is computed by assigning a different "weight" to each point. The weight vector wt should be indexed by «I» (or by Run if «I» is not specified), and the weighted variance is computed using one of these forms:
Variance(X, w: wt)
Variance(X, I, w:wt)
When the «w» parameter is not specified, and the running index «I» is either the Run index or is not specified, then the weighting defaults to the value in the system variable SampleWeighting.
The weighted variance is defined as
- [math]\displaystyle{ {\sum_i w_i (x_i-\bar{x})^2} \over { \sum_i w_i (1-w_i) } }[/math]
where the sum is taken over numeric values, [math]\displaystyle{ \bar{x} }[/math] is the weighted mean, and where [math]\displaystyle{ \sum_i w_i = 1 }[/math]. If Sum(w, i) <> 1, the «w»'s are normalized, so that the sum over numeric values is 1. This is an unbiased estimator of the weighted variance.
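The definition above can be sketched in a few lines of Python. This is a minimal illustration of the formula as stated on this page, assuming every point is an ordinary finite number with positive weight; the function name `weighted_variance` and the sample data are hypothetical, and the sketch is not Analytica's own implementation.

```python
def weighted_variance(x, w):
    """sum_i w_i (x_i - xbar)^2 / sum_i w_i (1 - w_i), with weights normalized."""
    total = sum(w)
    w = [wi / total for wi in w]                      # normalize so the weights sum to 1
    xbar = sum(wi * xi for wi, xi in zip(w, x))       # weighted mean
    numerator = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    denominator = sum(wi * (1 - wi) for wi in w)
    return numerator / denominator

# Hypothetical data: three points, the first weighted twice as heavily.
x = [1.0, 2.0, 4.0]
wt = [2.0, 1.0, 1.0]
print(weighted_variance(x, wt))
```

Because the weights are normalized first, scaling every weight by the same factor (for example, passing `[4.0, 2.0, 2.0]`) leaves the result unchanged.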
When [math]\displaystyle{ w_i }[/math] is constant, this simplifies to
- [math]\displaystyle{ {\sum_i (x_i-\bar{x})^2} \over {N-1} }[/math]
where N is the number of points (sampleSize when «I» is Run).
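The simplification can be checked directly. Taking equal normalized weights, each weight is 1/N, and substituting into the weighted definition gives the familiar sample variance. The steps below are a worked derivation from the formulas on this page, not additional documented behavior.

```latex
\begin{aligned}
w_i &= \tfrac{1}{N} \quad \text{for all } i, \\
\sum_i w_i (x_i - \bar{x})^2 &= \frac{1}{N} \sum_i (x_i - \bar{x})^2, \\
\sum_i w_i (1 - w_i) &= N \cdot \frac{1}{N}\left(1 - \frac{1}{N}\right) = \frac{N-1}{N}, \\
\frac{\sum_i w_i (x_i - \bar{x})^2}{\sum_i w_i (1 - w_i)}
  &= \frac{\tfrac{1}{N} \sum_i (x_i - \bar{x})^2}{\tfrac{N-1}{N}}
   = \frac{\sum_i (x_i - \bar{x})^2}{N-1}.
\end{aligned}
```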
When one or more points with non-zero weight in x are INF or -INF, Variance will return INF if Min(x) < Max(x), or INF if Min(x) = +INF or Max(x) = -INF. If there are fewer than two numeric points with positive weight, Variance returns NaN. Any point with zero weight is ignored, so that INF or NaN values don't cause the result to become NaN if they are given a zero weight.
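Two of the rules above, ignoring zero-weight points and returning NaN when fewer than two weighted points remain, can be sketched in Python as follows. This is only an illustration under stated assumptions: weights are taken to be non-negative, the function name is hypothetical, and the special handling of INF or -INF values that carry a non-zero weight is deliberately not modeled.

```python
import math

def variance_ignoring_zero_weights(x, w):
    # Drop every point whose weight is not positive; its value is never examined,
    # so an INF or NaN with zero weight cannot affect the result.
    kept = [(xi, wi) for xi, wi in zip(x, w) if wi > 0]

    # Fewer than two weighted points: the variance is undefined.
    if len(kept) < 2:
        return math.nan

    total = sum(wi for _, wi in kept)
    norm = [(xi, wi / total) for xi, wi in kept]      # weights now sum to 1
    xbar = sum(wi * xi for xi, wi in norm)            # weighted mean
    numerator = sum(wi * (xi - xbar) ** 2 for xi, wi in norm)
    denominator = sum(wi * (1 - wi) for _, wi in norm)
    return numerator / denominator

# Hypothetical data: the INF and NaN entries carry zero weight and are ignored.
print(variance_ignoring_zero_weights([1.0, 3.0, math.inf, math.nan], [1, 1, 0, 0]))

# Only one point has positive weight, so the result is NaN.
print(variance_ignoring_zero_weights([5.0, math.inf], [1, 0]))
```

Filtering before any arithmetic is the design choice that matters here: if the zero-weight values were multiplied by their weights instead, `0 * inf` would itself produce NaN in floating-point arithmetic.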