RegressionDist

 
Revision as of 08:39, 4 May 2007

RegressionDist(Y,B,I,K,C,S)

RegressionDist returns linear regression coefficients as a distribution.

Suppose you have data where Y was produced as:

 Y = Sum( C*B, K ) + Normal(0,S)

S is the measurement noise. You have the data (B[I,K] and Y[I]). You might or might not know the measurement noise S. So you perform a linear regression to obtain an estimate of C. Because your estimate is obtained from a finite amount of data, your estimate of C is itself uncertain. This function returns the coefficients C as a distribution (i.e., in Sample mode, it returns a sampling of coefficients indexed by Run and K), reflecting the uncertainty in the estimation of these parameters.
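The idea of "coefficients as a distribution" can be illustrated outside Analytica. The sketch below is not the library's implementation; it assumes the textbook result that, with known noise S, the least-squares estimate is normally distributed around (B'B)^-1 B'Y with covariance S^2 (B'B)^-1. All function names (ols_coefficients, sample_coefficients, and the matrix helpers) are hypothetical and exist only for this illustration.

```python
import math
import random

def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]

def inverse(m):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(m)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def cholesky(m):
    """Lower-triangular factor L with L * L^T = m (m must be positive definite)."""
    n = len(m)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(m[i][i] - s)
            else:
                low[i][j] = (m[i][j] - s) / low[j][j]
    return low

def ols_coefficients(y, b):
    """y: n observations; b: n rows of k basis values. Returns (c_hat, (B'B)^-1)."""
    bt = transpose(b)
    btb_inv = inverse(matmul(bt, b))
    bty = [sum(bi * yi for bi, yi in zip(col, y)) for col in bt]
    c_hat = [sum(r * v for r, v in zip(row, bty)) for row in btb_inv]
    return c_hat, btb_inv

def sample_coefficients(y, b, s, n_samples, rng):
    """Draw coefficient vectors from Normal(c_hat, s^2 (B'B)^-1).

    Assumption: noise level s is known; this mirrors the textbook result,
    not necessarily what RegressionDist does internally.
    """
    c_hat, btb_inv = ols_coefficients(y, b)
    cov = [[s * s * v for v in row] for row in btb_inv]
    low = cholesky(cov)
    k = len(c_hat)
    samples = []
    for _ in range(n_samples):
        z = [rng.gauss(0.0, 1.0) for _ in range(k)]
        samples.append([c_hat[i] + sum(low[i][j] * z[j] for j in range(i + 1))
                        for i in range(k)])
    return samples
```

Each returned row plays the role of one Run of the coefficient sample, and each position within a row corresponds to one element of K.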

Library

Multivariate Distributions.ana

Examples

If you know the noise level S in advance, then you can use historical data as a starting point for building a predictive model of Y, as follows:

 { Your model of the dependent variables: }
 Variable Y := { your historical dependent data, indexed by I }
 Variable B := { your historical independent data, indexed by I,K }
 Variable X := { indexed by K.  Maybe others.  Possibly uncertain }
 Variable S := { the known noise level }
 Chance C := RegressionDist(Y,B,I,K)
 Variable Predicted_Y := Sum(C*X,K) + Normal(0,S)
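The last line of the model above combines two sources of uncertainty: the uncertain coefficients C and the measurement noise S. As a rough illustration outside Analytica, the Python sketch below computes one predicted value per coefficient sample. The function name predict_samples and its argument layout are assumptions made for this example only.

```python
import random

def predict_samples(coef_samples, x, s, rng):
    """One predicted Y per coefficient sample.

    coef_samples: list of coefficient vectors (one per run), each of length k
    x:            the k input values for the case being predicted
    s:            the noise standard deviation
    rng:          a random.Random instance supplying the noise draws
    """
    return [sum(c * xi for c, xi in zip(coefs, x)) + rng.gauss(0.0, s)
            for coefs in coef_samples]
```

The spread of the returned values reflects both the coefficient uncertainty and the added noise term, which is why the model adds Normal(0,S) rather than reporting Sum(C*X,K) alone.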

If you don't know the noise level, then you need to estimate it: the Normal term of Predicted_Y requires it, and estimating it requires a regression. Since you must compute the regression coefficients anyway, you can pass them to RegressionDist as the optional C parameter so the regression is not repeated. The last three lines above become:

 Variable E_C := Regression(Y,B,I,K)
 Variable S := RegressionNoise(Y,B,I,K,E_C)
 Chance C := RegressionDist(Y,B,I,K,E_C)
 Variable Predicted_Y := Sum(C*X,K) + Normal(0,S)
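For readers who want to see what a noise estimate of this kind involves, here is a minimal Python sketch. It assumes the common convention of dividing the residual sum of squares by n - k degrees of freedom; whether RegressionNoise uses exactly this divisor is not stated on this page. The function name regression_noise is hypothetical.

```python
import math

def regression_noise(y, b, coefs):
    """Estimate the noise standard deviation from regression residuals.

    y:     n observed dependent values
    b:     n rows of k basis values
    coefs: k fitted coefficients (e.g. from a least-squares fit)
    Assumption: unbiased variance estimate with n - k degrees of freedom.
    """
    n, k = len(b), len(coefs)
    if n <= k:
        raise ValueError("need more data points than coefficients")
    residuals = [yi - sum(c * bij for c, bij in zip(coefs, row))
                 for yi, row in zip(y, b)]
    return math.sqrt(sum(r * r for r in residuals) / (n - k))
```

With only a few data points the estimate is itself quite uncertain, which is one reason to treat the noise level with care when it is not known in advance.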

If you use RegressionNoise to compute S, you should use Mid(RegressionNoise(...)) for the S parameter. However, when computing S for your prediction, don't use RegressionNoise in context. A better approach: if you don't know the measurement noise in advance, don't supply it as a parameter.

See Also
