Monte Carlo and probabilistic simulation


Probabilistic simulation means simulating probabilistic variables by selecting a random sample from each distribution. Analytica offers four sampling methods: Monte Carlo, Median Latin hypercube (the default), Random Latin hypercube, and Sobol (new in Analytica 5.0). We describe each of them, and then explain how to select among them.

Monte Carlo sampling

The most widely used sampling method is known as Monte Carlo, named after the randomness prevalent in games of chance, such as at the famous casino in Monte Carlo. In this method, each of the m sample points for each uncertain quantity, X, is generated at random from X with probability proportional to the probability density (or probability mass for discrete quantities) for X. Analytica uses the inverse cumulative method: it generates m uniform random values, u_i, for i = 1, 2, ..., m, between 0 and 1, using the specified random number method (see below). It then uses the inverse of the cumulative probability distribution to generate the corresponding values of X,

X_i where P(X ≤ X_i) = u_i, for i = 1, 2, ..., m

With the simple Monte Carlo method, the sample for every random variable X in the model, including those computed from other random quantities, consists of m independent random values from the true probability distribution for X. You can therefore use standard statistical methods to estimate the accuracy of statistics, such as the estimated mean or fractiles of the distribution, as described, for example, in Selecting the Sample Size.
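The inverse cumulative method described above can be sketched in a few lines of Python. This is only an illustration, not Analytica's implementation: the function name `monte_carlo_sample`, the exponential test distribution, and the fixed seed are assumptions made for the example.

```python
import math
import random


def monte_carlo_sample(inv_cdf, m, rng):
    """Return m independent draws obtained by the inverse cumulative method."""
    # Each u is uniform on [0, 1); inv_cdf(u) is the value x with P(X <= x) = u.
    return [inv_cdf(rng.random()) for _ in range(m)]


def exponential_inv_cdf(u, rate=1.0):
    """Inverse CDF of an exponential distribution (an arbitrary test case)."""
    return -math.log(1.0 - u) / rate


rng = random.Random(42)  # fixed seed only so that the sketch is repeatable
xs = monte_carlo_sample(exponential_inv_cdf, 1000, rng)
print("sample mean:", sum(xs) / len(xs))  # the true mean is 1 / rate = 1
```

Because the draws are independent, the usual standard-error formulas apply to statistics computed from `xs`.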

Median Latin hypercube

Median Latin hypercube sampling is the default method. It divides the range of each uncertain quantity X into m equiprobable intervals, where m is the sample size. The sample points are the medians of the m intervals, that is, the fractiles

X_i where P(X ≤ X_i) = (i - 0.5)/m, for i = 1, 2, ..., m.

These points are then randomly shuffled so that they are no longer in ascending order, to avoid nonrandom correlations among different quantities.
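As a minimal sketch of the procedure just described (again an illustration rather than Analytica's own code), the function below evaluates the inverse cumulative distribution at the m interval medians and then shuffles the result. The name `median_latin_hypercube` and the Uniform(0, 1) test case are assumptions for the example.

```python
import random


def median_latin_hypercube(inv_cdf, m, rng):
    """Return the medians of m equiprobable intervals, in shuffled order."""
    # Fractile levels (i - 0.5) / m for i = 1, 2, ..., m.
    points = [inv_cdf((i - 0.5) / m) for i in range(1, m + 1)]
    # Shuffle so that sample order carries no information about magnitude.
    rng.shuffle(points)
    return points


rng = random.Random(0)  # fixed seed only so that the sketch is repeatable
# For Uniform(0, 1) the inverse CDF is the identity function.
xs = median_latin_hypercube(lambda p: p, 5, rng)
print(sorted(xs))
```

Apart from the order, the result is deterministic: every run with the same m yields the same set of values.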

Random Latin hypercube

The random Latin hypercube method is similar to the median Latin hypercube method except that, instead of using the median of each of the m equiprobable intervals, it samples at random from each interval. With random Latin hypercube sampling, each sample is a true random sample from the distribution, as in simple Monte Carlo. However, the samples are not totally independent because they are constrained to have one sample from each of the m intervals.
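The only change from the median variant is where each point falls within its interval. The sketch below, with the assumed name `random_latin_hypercube` and a Uniform(0, 1) test case, draws one uniformly random probability level inside each of the m intervals.

```python
import random


def random_latin_hypercube(inv_cdf, m, rng):
    """Return one random draw from each of m equiprobable intervals, shuffled."""
    # Interval i covers probability levels from (i - 1) / m up to i / m.
    points = [inv_cdf((i - 1 + rng.random()) / m) for i in range(1, m + 1)]
    rng.shuffle(points)
    return points


rng = random.Random(7)  # fixed seed only so that the sketch is repeatable
m = 10
xs = random_latin_hypercube(lambda p: p, m, rng)
# Each interval [k/m, (k+1)/m) should contain exactly one point.
print(sorted(int(x * m) for x in xs))
```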

Sobol sampling

(New to Analytica 5.0)

The Sobol sampling method is a quasi-Monte Carlo method that attempts to spread points out evenly in probability space across multiple dimensions. Latin hypercube methods spread points out uniformly for each scalar quantity separately, but since each dimension is treated independently, the coverage of the multidimensional space may not be very uniform. Sobol sampling attempts to sample in a way similar to Latin hypercube, but coordinates across multiple scalar quantities simultaneously. It does this by applying Sobol sequences to each scalar quantity.

Sobol sampling comes with a much stronger convergence guarantee than pure Monte Carlo sampling. Monte Carlo sampling error converges as [math]\displaystyle{ O(1/\sqrt{n}) }[/math], whereas Sobol converges as [math]\displaystyle{ O((\log n)^d / n) }[/math], where n is the sample size and d is the number of uncertain scalar quantities. For a fixed d, Sobol's convergence rate at extremely large n starts to resemble [math]\displaystyle{ O(1/n) }[/math], often seen as the holy grail of simulation. However, since [math]\displaystyle{ (\log n)^d }[/math] is a very large number even when d is of moderate size, the guaranteed bound is of little practical use at realistic sample sizes.
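To see why the bound is of limited practical use, it can help to evaluate the two expressions directly. The sketch below does only that arithmetic. It assumes natural logarithms and ignores the constant factors hidden by the O notation, so the numbers indicate how the bounds scale and are not error estimates for any real model.

```python
import math


def monte_carlo_rate(n):
    """The 1 / sqrt(n) term of the Monte Carlo error rate."""
    return 1.0 / math.sqrt(n)


def sobol_bound(n, d):
    """The (log n)^d / n term of the Sobol bound (natural log assumed)."""
    return math.log(n) ** d / n


n = 1_000_000
for d in (1, 2, 5, 10):
    print(f"d={d:2d}  Monte Carlo: {monte_carlo_rate(n):.3g}  "
          f"Sobol bound: {sobol_bound(n, d):.3g}")
```

At a sample size of one million, the Sobol expression is smaller than the Monte Carlo one for d = 1 and d = 2, but already larger for d = 5 and far larger for d = 10.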

Choosing a sampling method

The advantage of Latin hypercube methods is that they provide more uniform distributions of samples for each distribution than simple Monte Carlo sampling. Median Latin hypercube, since it uses the median of each equiprobable interval, is even more uniformly distributed than random Latin hypercube. If you display the PDF of a variable that is defined as a single continuous distribution, or is dependent on just one continuous uncertain variable, the distribution usually looks fairly smooth even with a small sample size (such as 20) with median Latin hypercube sampling, whereas simple Monte Carlo results look quite noisy.

The noise-reduction advantage of Latin hypercube diminishes when the result depends on two or more uncertain quantities that have comparable effects on the result, and it shrinks further as the number of such quantities grows. For more than 5 or so uncertain quantities, Latin hypercube methods might not be discernibly better than simple Monte Carlo. Since the median Latin hypercube method is sometimes much better, and almost never worse than the others, Analytica uses it as the default method.
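A small experiment can illustrate this trend. The sketch below estimates the mean of a product of d independent Uniform(0, 1) quantities with a sample size of 20, and compares the root-mean-square error of median Latin hypercube against simple Monte Carlo over repeated runs. The product test function, the helper names, and the trial count are arbitrary choices for the illustration. Other models will give different ratios.

```python
import math
import random


def median_lhs_column(m, rng):
    """Median Latin hypercube sample of Uniform(0, 1), shuffled."""
    col = [(i - 0.5) / m for i in range(1, m + 1)]
    rng.shuffle(col)
    return col


def monte_carlo_column(m, rng):
    """Simple Monte Carlo sample of Uniform(0, 1)."""
    return [rng.random() for _ in range(m)]


def rms_error(make_column, d, m, trials, rng):
    """RMS error of the estimated mean of X_1 * ... * X_d over many runs."""
    true_mean = 0.5 ** d  # the mean of a product of independent uniforms
    total = 0.0
    for _ in range(trials):
        cols = [make_column(m, rng) for _ in range(d)]
        estimate = sum(math.prod(row) for row in zip(*cols)) / m
        total += (estimate - true_mean) ** 2
    return math.sqrt(total / trials)


rng = random.Random(1)  # fixed seed only so that the sketch is repeatable
ratios = {}
for d in (1, 2, 6):
    lhs = rms_error(median_lhs_column, d, 20, 500, rng)
    mc = rms_error(monte_carlo_column, d, 20, 500, rng)
    ratios[d] = lhs / mc  # below 1 means Latin hypercube is less noisy
    print(f"d={d}  LHS/MC error ratio: {ratios[d]:.3f}")
```

With a single uncertain quantity the ratio is essentially zero, and it moves toward 1 as more quantities with comparable influence are multiplied together.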

When not to use Latin hypercube sampling

Very rarely, median Latin hypercube can produce poor results. This happens when the model includes a periodic function (like a Sin function) and the period is similar to the size of the equiprobable intervals on the uncertain parameter. For example:

X := Uniform(1, Samplesize)

Y := Sin(2*Pi*X)

In this case, median Latin hypercube sampling gives very poor results, so you should use random Latin hypercube or simple Monte Carlo, which avoid this problem. However, the vast majority of models have no periodic function of this kind, so you do not need to worry about the reliability of median Latin hypercube sampling.
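One way to see the problem is to reproduce the example outside Analytica. In the sketch below the median Latin hypercube points of X are spaced almost exactly one period apart, so the sampled Y values trace a single slow sine cycle across the sorted sample instead of the many cycles actually present. The result is a strong spurious correlation between X and Y. The helper names and the sample size of 100 are assumptions for the illustration.

```python
import math
import random


def correlation(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def y_of(xs):
    """Y := Sin(2*Pi*X), as in the example above."""
    return [math.sin(2 * math.pi * x) for x in xs]


m = 100
rng = random.Random(3)  # fixed seed only so that the sketch is repeatable

# X := Uniform(1, m) under each method. Shuffling is omitted because it
# does not affect the correlation between X and Y.
x_median = [1 + (m - 1) * (i - 0.5) / m for i in range(1, m + 1)]
x_random = [1 + (m - 1) * (i - 1 + rng.random()) / m for i in range(1, m + 1)]
x_mc = [1 + (m - 1) * rng.random() for _ in range(m)]

r_median = correlation(x_median, y_of(x_median))
r_random = correlation(x_random, y_of(x_random))
r_mc = correlation(x_mc, y_of(x_mc))
print(f"median LHS: {r_median:.2f}  random LHS: {r_random:.2f}  MC: {r_mc:.2f}")
```

The median Latin hypercube correlation is strongly negative, while the other two methods give values close to zero, which is what the model itself implies.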

A visual comparison

Some intuition about the sampling methods can be obtained by comparing the coverage in 2-D of each method. Ideally, samples would be generated so that the density of points is highly uniform in all parts of the space, with very little clumping of multiple samples and no large areas of the space without any sample points. To generate these scatter plots, we used a sample size of 1000 with:

Index Dim := [1,2]

Chance X := Uniform(0,1,over:Dim)

We then plotted X as a scatter plot with Dim as the Comparison index.

[Figures: four 2-D scatter plots of the sample points, one each for Pure Monte Carlo, Median Latin Hypercube, Random Latin Hypercube, and Sobol Sampling.]