Expected value of information -- EVI, EVPI, and EVSI
The expected value of information (EVI) is the increase in expected value due to getting more information about an uncertain quantity. EVI is perhaps the most sophisticated method for sensitivity analysis. This page explains the EVI, EVPI (expected value of perfect information), and EVSI (expected value of sample information), and describes an Analytica library for efficient estimation of these quantities using Latin hypercube simulation.
Understanding the expected value of information (EVI)
At first, it may seem paradoxical that you can estimate the value of information about an uncertain quantity X before you find out the information -- e.g. the actual value of X. The key is that it's the expected value of information -- i.e. the increase in value due to making your decision after you learn the value of X, taking the expectation (or average) over your current uncertainty in X expressed as a probability distribution. Some people use the shortened name value of information or just VOI. We prefer the full name expected value of information or EVI to remind us that the value is expected -- i.e. taking the mean over a probability distribution.
If you are the decision maker and have to make a decision now, before you learn the value of X, as a good decision analyst, you would choose a decision, call it D1, that maximizes the expected value of your objective -- taking the expectation over all uncertainties. (The objective might be a measure of utility, or negative loss, depending on how you formulate the decision problem.) After you find out the actual value of X, you may choose a different decision, D2, which maximizes your expected objective given the value of X (taking the expectation over uncertainties other than X). You don't yet know X, but you can estimate the expected value of information on X as the increase in expected value given X, taking the expectation (mean) over the probability distribution expressing your current uncertainty about X.
According to the tenets of decision theory, a rational person should want to maximize their expected utility. Utility is a way to define an objective to quantify a decision maker's preferences over outcomes. Utility may incorporate risk aversion or other attitude to risk, and may be a nonlinear function of monetary value. When talking about EVI, it is usual to use the word "value" for simplicity, even though we may actually mean utility or a (negative) loss function.
The EVI (and related measures, such as the expected value of perfect information) assume that the decision maker will select a decision to maximize expected value, whether before or after getting new information -- that is, behave as a rational person according to the tenets of decision theory. They also assume that the decision problem is formulated as a decision analysis, and so includes these three kinds of variables:
- one or more decision variables that the decision maker can control,
- one or more chance variables that are uncertain, whose uncertainty is expressed as a probability distribution,
- an objective variable that defines the value or utility that you are trying to maximize, or the loss to minimize. The objective should be influenced by at least one decision variable and at least one chance variable.
In Analytica, you should define these three kinds of variables using the appropriate classes: Decision, Objective, and Chance.
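For example, a minimal (and purely hypothetical) model might contain:

Decision Buy_insurance := ['Yes', 'No']
Chance Damage := Uniform(0, 10K)
Objective Net_cost := If Buy_insurance = 'Yes' Then 500 Else Damage

Here the objective is a cost, so you would minimize its expected value (or, equivalently, maximize its negative).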
Value of information library
There are two versions of the Value of Information library:
- Expected value of info lib with examples.ana: The library with example applications to two decision models, the Plane catching model (what time should you leave home to catch a flight?) and the TXC model (select a level of control of emissions from an industrial plant to minimize control costs and mortality from air pollution).
- Expected value of info lib.ana: Just the library without examples.
Click the file you want to download. It will open the library in Analytica (if you have Analytica installed and depending on your browser settings). You can then save the library into a folder and import it into a model for which you want to use these methods.
These are the key functions:
Function EVPI(v, d)
Given an uncertain value v for a set of two or more discrete decision options d, it returns the Expected Value of Perfect Information. The EVPI is the expected increase in value if we were able to select the decision d that maximizes v after learning the true value of all the uncertainties represented in the random sample of v. It is the difference between the expected value if we made the "perfect decision" -- i.e. knowing the value of v exactly -- and the expected value of the Bayes' decision, i.e. the value of d that maximizes the expected value of v over its current uncertainty.
If your model has losses L for decisions D, simply call EVPI(-L, D)
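For example, with an objective variable V (to be maximized) and a discrete decision D (hypothetical identifiers), you could write:

Variable My_EVPI := EVPI(V, D)

which, as described under Computing EVPI below, should equal Mean(Max(V, D)) - Max(Mean(V), D).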
Function VPI(v, d)
Returns the value of perfect information (VPI) -- i.e. the probability distribution of the value of perfect information about all uncertain quantities in a decision problem with uncertain value v (which could be utilities) for discrete decision options d. The VPI is the increase in value v due to choosing the decision d that maximizes v[d] given the actual value of v, relative to choosing the Bayes' decision, d*, i.e. the value of d that maximizes the expected value of v. The more familiar expected value of perfect information (EVPI) is simply the expectation of the VPI, i.e. Mean(VPI(v, d)).
If your model has losses L for decisions D, simply call VPI(-L, D)
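Because EVPI is just the mean of this distribution, you can recover it from VPI -- for instance (hypothetical identifiers):

Variable VPI_dist := VPI(V, D)
Variable EVPI_check := Mean(VPI_dist)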
Function EVI_x(v, d, xVars)
Estimates the expected value of information (EVI) for each uncertain variable x in xVars, for a value (utility) v that is a function of discrete decision options d, and the variables in xVars. It returns values indexed by xVars. The value of the information on x is the difference in expected value of v if we make the decision knowing x versus the initial (Bayes') decision that maximizes expected v without knowing x, in both cases assuming the given probability distributions over the other variables in xVars.
«v» must be a variable (not just an expression or value). If the model objective, say L, is a loss whose mean (expected value) is to be minimized, you should add another variable V := -L, since this function assumes «v» is to be maximized.
It uses CVI_x_pc() to compute the conditional value of information for each percentile pc for each x. It then estimates the EVI using a piecewise linear density function fitted to the percentiles for each x.
Often, information that x is near its expected value does not change the decision and so has no value, while more extreme values are more likely to change the decision and so add more value. For this reason, it helps to have more percentiles pc near the extremes (near 0% and 100%) and fewer in the middle (near 50%).
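A sketch of a typical call, assuming «xVars» is an index defined as a list of identifiers (handles) of the chance variables of interest -- the identifiers here are hypothetical; see the example models bundled with the library for the exact setup:

Index My_xVars := [Traffic_delay, Security_wait]
Variable EVI_by_var := EVI_x(V, D, My_xVars)

The result is indexed by My_xVars, giving an estimated EVI for each chance variable.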
Function CVI_x(v, d, xVars, pc)
Estimates the conditional value of information (CVI) given a value function «v» and decision variable «d», given that each uncertain variable «xi» in «xVars» is set to each of the «pc» percentile(s) of its distribution. The CVI is a probability distribution based on the uncertain values of all variables in «xVars» other than «xi». The result is indexed by the variables in «xVars», the list of percentiles in «pc» (if an index), and Run (when evaluated in probabilistic mode).
«v» must be a variable that depends on decision variable «d» and the uncertain variables in «xVars», which should all be probabilistic. By default, «pc» is set to the global index Percentiles_for_CVI.
The CVI is the difference in value if we make the decision dpxi that maximizes the expected value of v knowing the value of uncertain variable xi = xipc (where xipc is the pc'th percentile of xi), compared to the Bayes' decision dstar that maximizes the expected value of v with no additional information, conditional on xi = xipc. In each case, we assume the variables in xVars (other than xi) remain uncertain with their specified distributions.
This method requires evaluation of «v» n times, where n = Size(d) * Size(xVars) * Size(pc) * SampleSize, so it could take a while. It estimates the total time based on the time to evaluate ArgMax(Mean(v), d) (which requires Size(d) * SampleSize evaluations). If the estimate is greater than promptSecs, it asks whether you want to reduce SampleSize. It shows a progress bar while computing the CVI for each variable in «xVars».
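For example, reusing the hypothetical index of chance variables from above and the library's default percentile index:

Variable CVI_table := CVI_x(V, D, My_xVars, Percentiles_for_CVI)

The result is indexed by My_xVars, Percentiles_for_CVI, and Run.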
Function EVIU(v, d)
Given an uncertain value v for a set of two or more discrete decision options d, it returns the Expected Value of Including Uncertainty. The EVIU is the difference between the expected value given the Bayes' decision, which maximizes the expected value of v over its probability distribution, and the expected value given the decision ignoring uncertainty (diu), which maximizes v deterministically assuming it is fixed at its 'Mid' value -- i.e. assuming all uncertain variables on which v depends use their Mid value, usually the median.
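In terms of the built-in functions used later on this page, the idea is roughly the following (a sketch, not necessarily the library's exact implementation):

Variable Diu := ArgMax(Mid(V), D)  { decision ignoring uncertainty }
Variable EVIU_sketch := Max(Mean(V), D) - Mean(V[D = Diu])

Max(Mean(V), D) is the expected value of the Bayes' decision, and Mean(V[D = Diu]) is the expected value, under full uncertainty, of the decision chosen by ignoring it.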
Function EVIU_by_x(v, d, xvars)
The Expected Value of Including Uncertainty for each variable x in xVars, where v is the value (utility) that is a function of the discrete decisions d and the uncertain variables in xVars. For each x in xVars, the EVIU is the difference between the expected value given the Bayes' decision that maximizes the expected value of v with all of the variables in xVars represented as probability distributions, and the expected value after setting x to its Mid value (usually its median).
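Usage mirrors EVI_x(); with the hypothetical index of chance variables from earlier:

Variable EVIU_by_var := EVIU_by_x(V, D, My_xVars)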
Computing EVPI
It is quite easy to estimate the expected value of perfect information (EVPI) in Analytica. Suppose you have a model that includes:
- D: a decision variable, which is discrete -- i.e. a list of possible values,
- X: one or more uncertain variables, each defined by a probability distribution,
- V: an objective variable that is a function of D and X.
The EVPI is the increase in expected value from making the best decision when you know X, relative to the decision you would make to maximize the expected (mean) value before you know X -- i.e. the EVPI is simply the difference:
Variable EVPI := Mean(Max(V, D)) - Max(Mean(V), D)
-- i.e. the difference between the expected value (Mean) when you maximize V over D (given the value of X), and the maximum over D of the expected value of V (before you know X). You can see this as the definition of function EVPI in the library.
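For instance, here is a toy instantiation (with made-up numbers) that you could paste into a model to try the formula:

Decision D := [100, 150, 200]  { hypothetical order quantities }
Chance X := Normal(150, 30)  { hypothetical uncertain demand }
Objective V := 8*Min([D, X]) - 5*D  { profit: revenue minus cost }
Variable EVPI_demand := Mean(Max(V, D)) - Max(Mean(V), D)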
The expected value of V for each value of D is simply:
Mean(V)
The "Bayes' decision" Db
is the decision that maximizes expected value with no further information:
Db := ArgMax(Mean(V), D)
The expected value given the Bayes decision is:
Max(Mean(V), D)
The optimal decision Dxi, if you know that the value of the uncertain quantities X is Xi, is:
Decision Dxi := WhatIf(ArgMax(V, D), X, Xi)
i.e. the value of D that maximizes V given that X has the value Xi.
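To turn this into a value of information, compare the expected value of this posterior decision with the expected value of sticking with the Bayes' decision, both conditioned on X = Xi. A sketch (the Bayes' decision is held in a local value so that WhatIf does not recompute it):

Variable CVI_at_Xi :=
Var dstar := ArgMax(Mean(V), D) Do
WhatIf(Max(Mean(V), D) - Mean(V[D = dstar]), X, Xi)

Taking the expectation of this difference over the distribution of X recovers the EVPI formula above.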
The expected value of information EVI about just one quantity, Xi, is a bit more complicated to estimate. Some decision analysts claim that it is hard or impossible to do using a Monte Carlo simulation scheme that treats some or all of the uncertain quantities as continuous, and so recommend converting continuous probability distributions to discrete distributions. The function EVI_x_2D(v, d, xVars) in the library below uses a two-dimensional kind of Monte Carlo simulation that represents the uncertainty about each quantity Xi in xVars with one sample, and the uncertainty in the other quantities by another sample, resulting in a method that requires n^2 evaluations of the model, where n is the Monte Carlo sample size. This is often too computationally intensive.
But, in fact, it is quite possible to estimate the EVI for each uncertain quantity much more efficiently: the EVI_x() function described above does so by computing the conditional value of information at a selected set of percentiles of each quantity and integrating over a piecewise linear density fitted to those percentiles.