Difference between revisions of "Probability"

(Moved this to its own page)
Line 1: Line 1:
#REDIRECT [[Statistical Functions and Importance Weighting#Probability]]
 
 
[[category:Statistical Functions]]
 
[[category:Statistical Functions]]
 +
 +
== Probability( b'', I, w'' ) ==
 +
 +
Returns an estimate of the probability that «b» is true. This is computed as the fraction of samples for which the boolean condition «b» holds.
 +
 +
When the index «I» is not specified, the probability is based on the uncertain chance variables in your model. In this case, the probability is taken over the [[Run]] index and «b» is evaluated in [[Evaluation Modes|sample mode]], so the result is an estimate based on the Monte Carlo sample.
 +
 +
Probability is equivalent to <code>[[Mean]](b<>False,I,w)</code>.
 +
 +
=== Example ===
 +
 +
:<code>[[Probability]]( engine_status = 'overheated' )</code> &rarr; 0.01
 +
 +
== Using on data sets ==
 +
 +
You can apply the function to a historical data set by specifying the index «I», which returns the fraction of data points that satisfy the condition.
 +
 +
=== Example ===
 +
 +
:<code>[[Probability]]( Historic_fuel_price < Current_fuel_price, Past_years )</code> &rarr; 0.25
 +
 +
== Weighted probability estimate ==
 +
 +
When the «w» parameter is specified, each point is weighted by the corresponding value of «w», so points can count unequally toward the estimate. The «w» parameter should be indexed by «I», or by [[Run]] when the «I» parameter is not specified. The «w» parameter defaults to the system variable [[SampleWeighting]] when «I» is [[Run]].
 +
 +
== Computing cumulative distribution functions ==
 +
 +
The [[Probability]] function can be a poor option for computing cumulative distribution functions. For example, you might be tempted to compute the cumulative distribution of a standard normal distribution at <code>z</code> using
 +
 +
:<code>[[Probability]]( Normal(0,1) <= z )</code>
 +
 +
Because this returns the fraction of samples that are less than or equal to <code>z</code>, the estimate may have substantial sampling error. In addition, every value of <code>z</code> that falls between the same two adjacent sample points yields the same computed probability, so the partial derivative with respect to <code>z</code> is zero everywhere except at the sample points themselves. This destroys information that may be needed when solving a non-linear optimization (NLP). A much better option is to use an [[:Category:Analytic Distribution Functions|analytic cumulative distribution function]], such as
 +
 +
:<code>CumNormal(z,0,1)</code>
 +
 +
== See also ==
 +
 +
* [[Statistical Functions and Importance Weighting]]
 +
* [[Frequency]]
 +
* [[:Category:Analytic Distribution Functions]]
 +
* [[GetFract]]
 +
* [[Cdf]]

Revision as of 19:31, 3 September 2015


Probability( b, I, w )

Returns an estimate of the probability that «b» is true. This is computed as the fraction of samples for which the boolean condition «b» holds.

When the index «I» is not specified, the probability is based on the uncertain chance variables in your model. In this case, the probability is taken over the Run index and «b» is evaluated in sample mode, so the result is an estimate based on the Monte Carlo sample.

Probability is equivalent to Mean(b<>False,I,w).

Example

Probability( engine_status = 'overheated' ) → 0.01
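The equivalence with a mean of true/false flags can be illustrated outside Analytica. The following Python sketch is not Analytica code; the function name `probability` and the eight-point sample are hypothetical and serve only to show the "fraction of samples" computation.

```python
def probability(b):
    """Return the fraction of sample points for which the condition holds.

    `b` is a non-empty sequence of booleans, one per sample point.
    This mirrors taking the mean of 0/1 flags.
    """
    flags = [1.0 if x else 0.0 for x in b]
    return sum(flags) / len(flags)


# Hypothetical sample of 8 engine states (illustrative data only).
engine_status = ["ok", "ok", "overheated", "ok",
                 "ok", "ok", "overheated", "ok"]

p = probability([s == "overheated" for s in engine_status])
print(p)  # 0.25 (2 of the 8 sample points satisfy the condition)
```

With a real Monte Carlo sample the same calculation runs over every sample point, so the estimate carries sampling error that shrinks as the sample size grows.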

Using on data sets

You can apply the function to a historical data set by specifying the index «I», which returns the fraction of data points that satisfy the condition.

Example

Probability( Historic_fuel_price < Current_fuel_price, Past_years ) → 0.25

Weighted probability estimate

When the «w» parameter is specified, each point is weighted by the corresponding value of «w», so points can count unequally toward the estimate. The «w» parameter should be indexed by «I», or by Run when the «I» parameter is not specified. The «w» parameter defaults to the system variable SampleWeighting when «I» is Run.
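As a rough illustration of weighting, the Python sketch below assumes the weighted estimate is the weight of the points satisfying the condition divided by the total weight. That formula is an assumption made for this example (the usual weighted mean), not a statement of Analytica's internals, and the name `weighted_probability` and the data are hypothetical.

```python
def weighted_probability(b, w):
    """Weighted fraction of points satisfying the condition.

    Assumed formula: sum of weights where b is true / sum of all weights.
    `b` and `w` must have the same length (one entry per point).
    """
    if len(b) != len(w):
        raise ValueError("b and w must have the same length")
    total = sum(w)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    return sum(wi for bi, wi in zip(b, w) if bi) / total


# Illustrative data: four points, the last one weighted most heavily.
b = [True, False, False, True]
w = [1.0, 1.0, 2.0, 4.0]

print(weighted_probability(b, w))          # 0.625, i.e. (1 + 4) / 8
print(weighted_probability(b, [1.0] * 4))  # 0.5, equal weights give the plain fraction
```

With equal weights the result reduces to the unweighted fraction, which is consistent with the unweighted case described earlier.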

Computing cumulative distribution functions

The Probability function can be a poor option for computing cumulative distribution functions. For example, you might be tempted to compute the cumulative distribution of a standard normal distribution at z using

Probability( Normal(0,1) <= z )

Because this returns the fraction of samples that are less than or equal to z, the estimate may have substantial sampling error. In addition, every value of z that falls between the same two adjacent sample points yields the same computed probability, so the partial derivative with respect to z is zero everywhere except at the sample points themselves. This destroys information that may be needed when solving a non-linear optimization (NLP). A much better option is to use an analytic cumulative distribution function, such as

CumNormal(z,0,1)
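The flat-between-samples behaviour can be seen in a small Python sketch. This is not Analytica code: `empirical_cdf` stands in for the sample-based estimate, `cum_normal` is a hypothetical stand-in for an analytic normal CDF built from the standard error function, and the four sample values are made up for illustration.

```python
import math


def empirical_cdf(samples, z):
    """Fraction of sample points less than or equal to z."""
    return sum(1 for x in samples if x <= z) / len(samples)


def cum_normal(z, mean=0.0, sd=1.0):
    """Analytic normal CDF expressed through the error function."""
    return 0.5 * (1.0 + math.erf((z - mean) / (sd * math.sqrt(2.0))))


# Illustrative fixed "sample"; a real Monte Carlo sample would be larger.
samples = [-1.5, -0.5, 0.2, 1.1]

# Two different z values that fall between the same adjacent sample points.
z1, z2 = 0.3, 1.0

print(empirical_cdf(samples, z1), empirical_cdf(samples, z2))  # 0.75 0.75
print(cum_normal(z1) < cum_normal(z2))                         # True
```

The sample-based estimate is a step function, so it cannot distinguish z1 from z2, whereas the analytic form changes smoothly with z and therefore gives an optimizer usable gradient information.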

See also
