Covariance
Computes an estimate of the covariance or weighted covariance between two quantities. The covariance is a measure of the amount of linear dependence. A covariance of 0 indicates no linear dependence between the two quantities. This does not necessarily mean they are independent, since a non-linear relationship may still exist. A positive value indicates that they tend to increase together, while a negative value indicates that an increase in one quantity tends to be accompanied by a decrease in the other.
Correlation is a related measure of linear dependence. Correlation is the covariance normalized by the standard deviations so that the result ranges from -1 to 1. The covariance is not limited to any particular range.
The covariance of a quantity with itself is its Variance.
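The relationships among covariance, correlation, and variance can be illustrated outside of the modeling language with a small Python sketch. The helper names `covariance` and `correlation` are hypothetical, and the n-1 (unbiased) normalization is an assumption made for this illustration. The sketch shows that rescaling one quantity changes the covariance but leaves the correlation unchanged, and that the covariance of a quantity with itself is its variance.

```python
import math

def covariance(xs, ys):
    """Sample covariance of two equal-length sequences (n-1 normalization assumed)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def correlation(xs, ys):
    """Covariance normalized by the two standard deviations."""
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))

x = [1.0, 2.0, 3.0, 4.0]
y = [10.0, 20.0, 30.0, 40.0]
y_scaled = [100.0 * v for v in y]  # same data expressed in different units

print(covariance(x, y), covariance(x, y_scaled))    # covariance depends on the units
print(correlation(x, y), correlation(x, y_scaled))  # correlation does not
print(covariance(x, x))                             # variance of x
```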
The Gaussian function accepts a covariance matrix as a parameter when specifying a multi-variate distribution.
Simple Usage
When X and Y are both uncertain quantities, the covariance is computed by
Covariance(X,Y)
Covariance of Data
If you have a data set containing two variables, A and B, where data points are indexed by J, the covariance of A and B is computed using
Covariance(A,B,J)
Here J is referred to as the running index.
If the two quantities are columns of a single array, use the subscript operator to extract each column. For example, the following computes the covariance between historical revenue in 2002 and 2003 (where data points are indexed by J).
Covariance( HistoricalRevenue[Year=2002], HistoricalRevenue[Year=2003], J )
Weighted Covariance
Unweighted covariance treats all data or sample points as equally weighted. Weighted covariance computes the covariance when each data point may have a different weight. The optional w parameter may be used to specify a weight, which should be indexed by the running index (or by Run if no running index is specified). For example, the following specifies an importance weight:
Covariance( X,Y, w:sampleImportance )
The global sample weighting, specified by the system variable SampleWeighting, is used by default.
Computing a Sample Covariance Matrix
Suppose each sample point is a vector along index I. A covariance matrix is a 2-D square symmetric matrix in which each element (m,n) is the covariance of column I=m and column I=n. To compute a sample covariance matrix for a given set of data, create a second index, I2, as a copy of I:
Index I2 := CopyIndex( I )
With this index, the covariance matrix, indexed by I and I2, is computed from data X using:
Covariance( X, X[I=I2] )
Or if data points are listed along an index other than Run, say J, this would be:
Covariance( X, X[I=I2], J )
A covariance matrix computed from a data set is always symmetric and positive semi-definite.
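The same construction can be sketched in plain Python for readers who want to check the symmetry and positive semi-definiteness claims numerically. This is only an illustration under stated assumptions: `covariance`, `covariance_matrix`, and `quadratic_form` are hypothetical helper names, the data values are made up, and the n-1 normalization is assumed. Each row of `samples` plays the role of one sample point, and each column plays the role of one position along index I.

```python
def covariance(xs, ys):
    """Sample covariance of two equal-length sequences (n-1 normalization assumed)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def covariance_matrix(samples):
    """Pairwise covariance of every column with every other column."""
    columns = list(zip(*samples))  # transpose: one tuple per column
    return [[covariance(a, b) for b in columns] for a in columns]

def quadratic_form(matrix, v):
    """Compute v' * matrix * v, which is non-negative for a PSD matrix."""
    size = len(v)
    return sum(v[i] * matrix[i][j] * v[j] for i in range(size) for j in range(size))

# Illustrative data: 4 sample points, each a vector of length 3.
samples = [
    [1.0, 2.0, 0.5],
    [2.0, 1.0, 1.5],
    [3.0, 4.0, 1.0],
    [4.0, 3.0, 2.0],
]
cov = covariance_matrix(samples)
for row in cov:
    print(row)
print(quadratic_form(cov, [1.0, -1.0, 0.5]))  # non-negative for any choice of vector
```

The diagonal of the result holds the variance of each column, and element (m,n) equals element (n,m).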
Full Declaration
Covariance(X,Y : Numeric ContextSamp[I] ; I : IndexType=Run ; w : NonNegative ContextSamp[I] = SampleWeighting )
Mathematical Details
Weighted covariance is given by [math]\displaystyle{ \frac{\sum_i w_i \hat{x}_i \hat{y}_i}{\sum_i w_i (1 - w_i)} }[/math]
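A minimal Python sketch of this formula follows. It rests on two assumptions that the text above does not state explicitly: that the hatted quantities are deviations from the weighted means, and that the weights are normalized to sum to 1 before the formula is applied. The function name `weighted_covariance` is hypothetical. Under these assumptions, equal weights reduce the result to the usual n-1 sample covariance, and a point with zero weight has no effect.

```python
def weighted_covariance(xs, ys, ws):
    """Weighted covariance: sum(w*dx*dy) / sum(w*(1-w)), with weights normalized to sum to 1."""
    if not (len(xs) == len(ys) == len(ws)):
        raise ValueError("xs, ys and ws must have the same length")
    total = sum(ws)
    if total <= 0:
        raise ValueError("weights must be non-negative with a positive sum")
    w = [wi / total for wi in ws]  # assumption: normalize the weights
    mean_x = sum(wi * x for wi, x in zip(w, xs))  # weighted means
    mean_y = sum(wi * y for wi, y in zip(w, ys))
    numerator = sum(wi * (x - mean_x) * (y - mean_y) for wi, x, y in zip(w, xs, ys))
    denominator = sum(wi * (1.0 - wi) for wi in w)
    if denominator == 0:
        raise ValueError("at least two points must carry non-zero weight")
    return numerator / denominator

x = [1.0, 2.0, 3.0, 4.0]
y = [10.0, 20.0, 30.0, 40.0]
print(weighted_covariance(x, y, [1.0, 1.0, 1.0, 1.0]))  # equal weights

# The last point is an outlier, but its weight of zero removes it from the estimate.
x_out = [1.0, 2.0, 3.0, 100.0]
y_out = [2.0, 4.0, 6.0, -50.0]
print(weighted_covariance(x_out, y_out, [1.0, 1.0, 1.0, 0.0]))
```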
See Also
- Correlation
- Statistical Functions and Importance Weighting
- RankCorrel (the Rank Correlation function)
- Gaussian