# Variance


## Variance(X, I, w)

Computes the variance of an uncertain quantity «X», or the «w»-weighted sample variance of a data set.

With multivariate samples (where each data point is a vector), the Variance function can also be used to compute the covariance (or weighted covariance) matrix from a data set.

If «X» is an uncertain quantity, dependent on Analytica distribution functions, the variance is obtained using Variance(X).

«X» is evaluated in sample mode, and the variance along the Run index is computed.

## Optional parameters

### I

Given a data set indexed by «I», the sample variance along index «I» is computed using Variance(X, I).

When the running index «I» is the system index Run (or is not specified), «X» is evaluated in sample mode and the variance among its numeric values is computed. If the running index is anything other than Run, then «X» is evaluated in context.

### w

The weighted variance is computed by assigning a different weight to each point. The weight vector wt should be indexed by «I» (or by Run if «I» is not specified), and the weighted variance is computed using one of these forms:

Variance(X, w: wt)
Variance(X, I, w: wt)

When the «w» parameter is not specified, and the running index «I» is either the Run index or is not specified, then the weighting defaults to the value in the system variable SampleWeighting.

The weighted variance is defined as

$\displaystyle{ {\sum_i w_i (x_i-\bar{x})^2} \over { \sum_i w_i (1-w_i) } }$

where the sum is taken over numeric values, $\displaystyle{ \bar{x} }$ is the weighted mean, and $\displaystyle{ \sum_i w_i = 1 }$. If Sum(w, I) <> 1, the weights are normalized so that their sum over numeric values is 1. This is an unbiased estimator of the weighted variance.
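As an illustration of the arithmetic in this definition, here is a minimal Python sketch. It is not Analytica's implementation; the function name `weighted_variance` and the sample data are hypothetical, and all inputs are assumed to be finite numbers with positive weights.

```python
def weighted_variance(xs, ws):
    """Sketch of sum_i w_i (x_i - mean)^2 / sum_i w_i (1 - w_i).

    Assumes every x is a finite number and every weight is positive.
    """
    total = sum(ws)
    w = [wi / total for wi in ws]                 # normalize so the weights sum to 1
    mean = sum(wi * xi for wi, xi in zip(w, xs))  # weighted mean
    numerator = sum(wi * (xi - mean) ** 2 for wi, xi in zip(w, xs))
    denominator = sum(wi * (1 - wi) for wi in w)
    return numerator / denominator


data = [2.0, 4.0, 4.0, 5.0]                       # hypothetical data set
equal = weighted_variance(data, [1, 1, 1, 1])     # constant weights
skewed = weighted_variance(data, [1, 1, 1, 3])    # last point counts three times as much
rescaled = weighted_variance(data, [2, 2, 2, 6])  # same relative weights as `skewed`
print(equal, skewed, rescaled)
```

Because the weights are normalized first, `skewed` and `rescaled` agree up to rounding: only the relative sizes of the weights matter.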

When $w_i$ is constant, this simplifies to

$\displaystyle{ {\sum_i (x_i-\bar{x})^2} \over {N-1} }$

where N is the number of points (sampleSize when «I» is Run).
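The simplification can be checked by substituting the normalized constant weight $w_i = 1/N$ into the weighted definition. The numerator becomes

$\displaystyle{ \sum_i {1 \over N} (x_i-\bar{x})^2 = {1 \over N} \sum_i (x_i-\bar{x})^2 }$

and the denominator becomes

$\displaystyle{ \sum_i {1 \over N} \left(1 - {1 \over N}\right) = {N-1 \over N} }$

The factors of $1/N$ cancel in the ratio, leaving $\displaystyle{ {\sum_i (x_i-\bar{x})^2} \over {N-1} }$, the usual unweighted sample variance.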

When one or more points with non-zero weight in x are INF or -INF, Variance will return INF if Min(x) < Max(x), or INF if Min(x) = +INF or Max(x) = -INF. If there are fewer than two numeric points with positive weight, Variance returns NaN. Any point with zero weight is ignored, so that INF or NaN values don't cause the result to become NaN if they are given a zero weight.
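The zero-weight and too-few-points rules can be sketched in Python as follows. This is an illustrative sketch only, not Analytica's implementation: the function name `variance_ignoring_zero_weights` is hypothetical, and the sketch assumes every point with positive weight is a finite number, so it does not model the INF cases described above.

```python
import math


def variance_ignoring_zero_weights(xs, ws):
    """Weighted variance that drops zero-weight points before computing.

    Assumes all points with positive weight are finite numbers.
    """
    # Points with zero weight are discarded, so an INF or NaN there has no effect.
    kept = [(x, w) for x, w in zip(xs, ws) if w > 0]
    if len(kept) < 2:
        return math.nan                           # fewer than two usable points
    total = sum(w for _, w in kept)
    mean = sum(w * x for x, w in kept) / total    # weighted mean of kept points
    numerator = sum((w / total) * (x - mean) ** 2 for x, w in kept)
    denominator = sum((w / total) * (1 - w / total) for _, w in kept)
    return numerator / denominator


# The INF has zero weight, so only the points 1 and 3 contribute.
with_inf = variance_ignoring_zero_weights([1.0, 3.0, math.inf], [1, 1, 0])
# Only one point has positive weight, so the result is NaN.
too_few = variance_ignoring_zero_weights([1.0, 3.0], [1, 0])
print(with_inf, too_few)
```

Filtering before any arithmetic is the design choice that matters here: multiplying an INF value by a zero weight would itself produce NaN in floating-point arithmetic, so the zero-weight points have to be removed rather than merely weighted out.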