'''Importance weighting''' is a powerful enhancement to Monte Carlo and Latin hypercube simulation that lets you get more useful information from fewer samples. It is especially valuable for risky situations with a small probability of an extremely good or bad outcome. By default, all simulation samples are equally likely. With importance weighting, you increase the number of samples in an area of special interest (extremely good or bad outcomes) by some factor ''f'', and assign those sample points an inverse weight ''1/f'' in [[SampleWeighting]] so that the resulting probability distribution is unchanged. Thus, you can get more detail where it matters and less where it matters less. Uncertainty views and statistical functions use [[SampleWeighting]] to reweight the samples so that the results are unbiased.
  
You can also modify [[SampleWeighting]] interactively to reflect different input distributions and so rapidly see the effects on results without having to rerun the simulation. By default, it uses equal weights, so you don’t have to worry about importance sampling unless you want to use it.
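
For example, suppose a chance variable <code>X</code> was sampled from a standard normal distribution, and you want to preview results as if it had been drawn from a normal with mean 0.5 instead. Since only relative weights matter, you can weight each sample by the ratio of the two densities. This is a minimal sketch; <code>X</code> stands for whatever chance variable you want to reweight:

:<code>SampleWeighting := Exp(-(X - 0.5)^2/2) / Exp(-X^2/2)</code>

Results then approximate the shifted distribution without rerunning the simulation, although precision degrades where the new density places weight in regions the original rarely sampled.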
  
 
'''SampleWeighting:''' To set up importance weighting, you assign a weight to each sample point in the built-in variable <code>SampleWeighting</code>. Here is how to open its [[Object window]]:
# De-select all nodes, e.g., by clicking in the background of the diagram.
# From the '''Definition''' menu, select '''System Variables''', and then '''SampleWeighting'''. Its [[Object window]] opens.

[[image:Chapter15 28.png]]

Initially, its definition is 1, meaning it has an equal weight of 1 for every sample. (1 is equivalent to an array of 1s, e.g., <code>Array(Run, 1)</code>.) For importance weighting, you assign a different weighting array indexed by <code>Run</code>. It automatically normalizes the weighting to sum to one, so you need only supply relative weights.
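
For instance, because the normalization is automatic, a weighting that counts every even-numbered sample twice as heavily as every odd-numbered one could be written as follows (a minimal illustration; any positive array indexed by <code>Run</code> works):

:<code>SampleWeighting := If Mod(Run, 2) = 0 Then 2 Else 1</code>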
 
Suppose you have a distribution on variable «X», with density function ''f(x)'', which has a small critical region — values of «x» that cause a large loss or gain — with density ''cr(x)''. To generate the distribution on «x», we sample from a mixture of ''f(x)'' and ''cr(x)'', with probability «p» for ''cr(x)'' and ''(1 - p)'' for ''f(x)''. Then use the [[SampleWeighting]] function to adjust the results back to what they should be:
  
:<code>f(x) / ((1 - p)*f(x) + p*cr(x))</code>
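
This weighting works because the weighted average of any quantity ''h(x)'' over samples drawn from the mixture reproduces its expectation under the original density. Writing ''g(x)'' for the mixture density actually sampled:

:<math>g(x) = (1 - p)\,f(x) + p\,cr(x), \qquad \operatorname{E}_f[h(X)] = \operatorname{E}_g\!\left[h(X)\,\frac{f(X)}{g(X)}\right]</math>

so weighting each sample by ''f(x)/g(x)'' leaves all reweighted statistics unbiased.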
  
For example, suppose you are selecting the design <code>Capacity</code> in megawatts for an electrical power generation system for a critical facility, to meet an uncertain <code>Demand</code>, also in megawatts, with a [[LogNormal|lognormal]] distribution:
  
 
:<code>Chance Demand := Lognormal(100, 1.5)</code>
:<code>Decision Capacity := 240</code>
:<code>Probability(Demand > Capacity) &rarr; 0.015</code>
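
As a cross-check: assuming Analytica's <code>Lognormal(median, gsdev)</code> parameterization, <code>Ln(Demand)</code> is normal with mean <code>Ln(100)</code> and standard deviation <code>Ln(1.5)</code>, so the exact tail probability is:

:<code>1 - CumNormal(Ln(Capacity/100)/Ln(1.5)) &rarr; 0.0154</code>

in close agreement with the simulated 1.5%.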
  
According to this simulation, the probability of demand exceeding capacity is 1.5%. Suppose the operator receives a <code>Price</code> of $20 per megawatt-hour delivered, but must pay a <code>Penalty</code> of $200 per megawatt-hour of demand that it fails to supply:
  
:<code>Variable Price := 20</code>
:<code>Variable Penalty := 200</code>
:<code>Variable Revenue := IF Demand <= Capacity THEN Price*Demand</code>
::<code>ELSE Price*Capacity - (Demand - Capacity)*Penalty</code>
:<code>Mean(Revenue) &rarr; $2309</code>
  
The estimated mean revenue of $2309 is imprecise because there is only a small (1.5%) probability of incurring the large penalty ($200 per MWh of unmet demand), and only a few sample points will fall in this region.
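
To gauge the imprecision, you can estimate the standard error of the mean as the sample standard deviation divided by the square root of the number of samples (a rough sketch, using the <code>SampleSize</code> system variable):

:<code>SDeviation(Revenue)/Sqrt(SampleSize)</code>

With only a handful of points in the penalty region, even this error estimate is unstable. You can get a more accurate estimate by using importance sampling to increase the number of samples in the critical region, where <code>Demand > Capacity</code>: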
  
:<code>Chance Excess_demand := Truncate(Demand, 200)</code>
:<code>Variable Mix_prob := 0.6</code>
:<code>Variable Weighted_demand := If Bernoulli(Mix_prob)</code>
::<code>THEN Excess_demand ELSE Demand</code>
 
  
Since we have increased the probability of sample points with demand exceeding supply, we need to adjust their weights inversely by dividing by the mixture density:

:<code>SampleWeighting := Density(Demand) /</code>
::<code>((1 - Mix_prob)*Density(Demand) + Mix_prob*Density(Excess_demand))</code>

Thus, we compute <code>Weighted_demand</code> as a mixture of the original distribution on <code>Demand</code> and the distribution in the critical region, <code>Excess_demand</code>, and we assign the weights to <code>SampleWeighting</code> using its [[Object window]], opened as described above. In the above, <code>Density(Demand)</code> is a placeholder for the actual density function of your demand distribution, and <code>Density(Excess_demand)</code> a placeholder for the truncated density.
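
For this particular model, those placeholders could be filled in explicitly. The following is a sketch, assuming the <code>Lognormal(100, 1.5)</code> parameterization above (median 100, geometric standard deviation 1.5); the variables <code>Dens_f</code> and <code>Dens_cr</code> are introduced here purely for illustration:

:<code>{ Lognormal density of the original Demand, evaluated at each sampled value }</code>
:<code>Variable Dens_f := Exp(-(Ln(Weighted_demand/100)/Ln(1.5))^2/2) / (Weighted_demand*Ln(1.5)*Sqrt(2*Pi))</code>
:<code>{ Same density renormalized over the region above 200, zero below it, matching Truncate(Demand, 200) }</code>
:<code>Variable Dens_cr := If Weighted_demand >= 200 Then Dens_f/(1 - CumNormal(Ln(2)/Ln(1.5))) Else 0</code>
:<code>SampleWeighting := Dens_f / ((1 - Mix_prob)*Dens_f + Mix_prob*Dens_cr)</code>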
  
 
For more on weighted statistics and conditional statistics, see [[Weighted statistics and w parameter]].
  
 
==See Also==

* [[Statistical Functions and Importance Weighting]]
* [[Weighted statistics and w parameter]]
* [[System Variables]]
* [[SampleWeighting]]
* [[Importance analysis]] is a different concept, a method for doing sensitivity analysis.
  
  
 
<footer>Multivariate distributions / {{PAGENAME}} / Statistics, Sensitivity, and Uncertainty Analysis</footer>
