Importance weights
Latest revision as of 00:35, 22 May 2018
Importance weighting is a powerful enhancement to Monte Carlo and Latin hypercube simulation that lets you get more useful information from fewer samples. It is especially valuable for risky situations with a small probability of an extremely good or bad outcome. By default, all simulation samples are equally likely. With importance weighting, you increase the number of samples in these areas of special interest (good or bad) by a factor f. You assign inverse weights 1/f to these sample points in SampleWeighting so that the resulting probability distribution is the same. Thus, you can get more detail where it matters and less where it matters less. Uncertainty views and statistical functions use SampleWeighting to reweight the samples so that the results are unbiased.
You can also modify SampleWeighting interactively to reflect different input distributions and so rapidly see the effects on results without having to rerun the simulation. By default, it uses equal weights, so you don’t have to worry about importance sampling unless you want to use it.
SampleWeighting: To set up importance weighting, you assign a weight to each sample point in the built-in variable SampleWeighting. Here is how to open its Object window:
- De-select all nodes, e.g., by clicking in the background of the diagram.
- From the Definition menu, select System Variables, and then SampleWeighting. Its Object window opens.
Initially, its definition is 1, meaning it has an equal weight of 1 for every sample. (1 is equivalent to an array of 1s, e.g., Array(Run, 1).) For importance weighting, you assign a different weighting array indexed by Run. It automatically normalizes the weighting to sum to one, so you need only supply relative weights.
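For intuition, here is a minimal sketch in Python with numpy (not Analytica) of what normalizing relative weights means for a statistic such as the mean; the sample values and weights are made up for illustration:

```python
import numpy as np

# Hypothetical sample of 8 points with relative (unnormalized) weights:
# the first two points count double.
samples = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
weights = np.array([2.0, 2.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0])

# Normalize so the weights sum to one, as the weighting is normalized
# automatically; only the relative magnitudes matter.
norm_w = weights / weights.sum()
weighted_mean = np.sum(norm_w * samples)

# Equal weights (the default definition of 1) reduce to the ordinary mean.
equal_mean = samples.mean()
```

With the default equal weighting, every point gets weight 1/n and the weighted mean is just the ordinary sample mean; supplying weights [2, 2, 1, ...] shifts the mean toward the doubly-weighted points.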
Suppose you have a distribution on variable «X», with density function f(x), and a small critical region in which «x» causes a large loss or gain. Let cr(x) be a density concentrated in that critical region. To generate the distribution on «x», we sample from a mixture of f(x) and cr(x), with probability p for cr(x) and (1 - p) for f(x). The weighting that adjusts the results back to the original distribution is then:

f(x) / ((1 - p)*f(x) + p*cr(x))
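To see this formula in action outside Analytica, here is a hedged numpy sketch that estimates a small tail probability of a standard normal target f(x). The critical-region density cr(x) is an illustrative choice, a normal centered in the tail; both densities are assumptions made only for this example:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 100_000, 0.5   # p = probability of drawing from the critical-region density

def f(x):
    """Target density: standard normal (illustrative choice)."""
    return np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)

def cr(x):
    """Density concentrated in the critical region: N(3, 1) (illustrative)."""
    return np.exp(-(x - 3)**2 / 2) / np.sqrt(2 * np.pi)

# Draw each sample from cr with probability p, otherwise from f.
from_cr = rng.random(n) < p
x = np.where(from_cr, rng.normal(3.0, 1.0, n), rng.normal(0.0, 1.0, n))

# Inverse weights restore the original distribution f:
w = f(x) / ((1 - p) * f(x) + p * cr(x))

# Self-normalized weighted estimate of the tail probability P(X > 2.5) under f.
est = np.sum(w * (x > 2.5)) / np.sum(w)
```

Because roughly half the samples land in the tail, the weighted estimate of P(X > 2.5) (true value about 0.0062) is far more precise than a plain Monte Carlo estimate from the same number of samples.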
For example, suppose you are selecting the design Capacity in megawatts for an electrical power generation system for a critical facility to meet an uncertain Demand with a lognormal distribution:

Chance Demand := Lognormal(100, 1.5)
Decision Capacity := 240
Probability(Demand > Capacity) → 0.015
According to this simulation, the probability of demand exceeding capacity is 1.5%. Suppose the operator receives a Price of $20 per megawatt-hour delivered, but must pay a Penalty of $200 per megawatt-hour of demand that it fails to supply:

Variable Price := 20
Variable Penalty := 200
Variable Revenue := IF Demand <= Capacity THEN Price*Demand
    ELSE Price*Capacity - (Demand - Capacity)*Penalty
Mean(Revenue) → $2309
The estimated mean revenue of $2309 is imprecise because there is only a small (1.5%) probability of incurring the large penalty ($200 per megawatt-hour of demand that it cannot supply), so only a few sample points fall in this region. You can get a more accurate estimate by using importance sampling to increase the number of samples in the critical region, where Demand > Capacity:

Chance Excess_demand := Truncate(Demand, 200)
Variable Mix_prob := 0.6
Variable Weighted_demand := If Bernoulli(Mix_prob)
    THEN Excess_demand ELSE Demand
Since we have increased the probability of sample points with demand exceeding supply, we need to adjust their weights inversely by dividing by the mixture density:
SampleWeighting := Density(Demand)
/ ((1 - Mix_prob)*Density(Demand) + Mix_prob*Density(Excess_demand))
Thus, we compute a Weighted_demand as a mixture of the original distribution on Demand and the distribution in the critical region, Excess_demand. We assign the weights to SampleWeighting, using the Object window for SampleWeighting opened as described above. In the above, Density(Demand) is a placeholder for the actual target density function for your demand, and Density(Excess_demand) a placeholder for the truncated density.
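Continuing the numpy sketch, here is one way to instantiate those placeholders: the lognormal pdf for Density(Demand), and the renormalized tail pdf for Density(Excess_demand), assuming Truncate(Demand, 200) truncates the distribution below at 200. The importance-weighted mean revenue agrees with the plain Monte Carlo estimate while putting far more sample points in the shortfall region:

```python
import numpy as np
from statistics import NormalDist

rng = np.random.default_rng(7)
n = 100_000
mu, sigma = np.log(100), np.log(1.5)   # log-scale parameters of Demand
capacity, price, penalty = 240, 20, 200
cutoff, mix_prob = 200.0, 0.6          # Truncate(Demand, 200), Mix_prob

std_norm = NormalDist()
p_above = 1 - std_norm.cdf((np.log(cutoff) - mu) / sigma)  # P(Demand > 200)

def density(x):
    """Lognormal pdf of Demand (the 'Density(Demand)' placeholder)."""
    return np.exp(-((np.log(x) - mu) ** 2) / (2 * sigma**2)) \
        / (x * sigma * np.sqrt(2 * np.pi))

def trunc_density(x):
    """Pdf of Demand truncated below at 200 ('Density(Excess_demand)')."""
    return np.where(x >= cutoff, density(x) / p_above, 0.0)

# Sample the mixture: the truncated tail with probability mix_prob,
# otherwise the original Demand distribution.
use_tail = rng.random(n) < mix_prob
# Inverse-CDF sampling of the truncated lognormal for the tail component.
q = (1 - p_above) + rng.random(n) * p_above
tail = np.exp(mu + sigma * np.array([std_norm.inv_cdf(v) for v in q]))
plain = rng.lognormal(mu, sigma, n)
demand = np.where(use_tail, tail, plain)

# Inverse weights, mirroring the SampleWeighting definition above.
w = density(demand) / ((1 - mix_prob) * density(demand)
                       + mix_prob * trunc_density(demand))

revenue = np.where(demand <= capacity,
                   price * demand,
                   price * capacity - (demand - capacity) * penalty)
is_mean = np.sum(w * revenue) / np.sum(w)   # importance-weighted mean revenue
```

With this mixture, roughly a fifth of the samples have Demand > Capacity instead of 1.5%, so the penalty region is densely sampled while the inverse weights keep the mean revenue estimate unbiased.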
For more on weighted statistics and conditional statistics, see Weighted statistics and w parameter.
See Also
- Statistical Functions and Importance Weighting
- Weighted statistics and w parameter
- SampleWeighting
- Importance analysis is a different concept, a method for doing sensitivity analysis.