Expressing Uncertainty
Analytica makes it easy to model and analyze uncertainties even if you have minimal background
in probability and statistics. The graphs below review several key concepts from probability and statistics to help you understand the probabilistic modeling facilities in Analytica. This chapter assumes that you have encountered most of these concepts before, but possibly in the distant past. If you need more information, see the “Glossary” or refer to an introductory text on probability and statistics.
Choosing an appropriate distribution
With Analytica, you can express uncertainty about any variable by using a probability distribution. You can base the distribution on available relevant data, on the judgment of a knowledgeable individual, or on some combination of data and judgment. Answer the following questions about the uncertain quantity to select the most appropriate kind of distribution:
- Is it discrete or continuous?
- If continuous, is it bounded?
- Does it have one mode or more than one?
- Is it symmetric or skewed?
- Should you use a standard or a custom distribution?
Is the quantity discrete or continuous?
When trying to express uncertainty about a quantity, the first technical question is whether the quantity is discrete or continuous.
A discrete quantity has a finite number of possible values — for example, the gender of a person or the country of a person’s birth. Logical or Boolean variables are a type of discrete variable
with only two values, true or false, sometimes coded as yes or no, present or absent, or 1 or 0 — for example, whether a person was born before January 1, 1950, or whether a person has ever
resided in California.
A continuous quantity can be represented by a real number, and has infinitely many possible
values between any two values in its domain. Examples are the quantity of an air pollutant
released during a given period of time, the distance in miles of a residence from a source of air
pollution, and the volume of air breathed by a specified individual during one year.
For a large discrete quantity, such as the number of humans residing within 50 miles of Disneyland
on December 25, 1980, it is often convenient to treat it as continuous. Even though you know
that the number of living people must be an integer, you might want to represent uncertainty about
the number with a continuous probability distribution.
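As a rough illustration of why this works, the following Python sketch compares an exact probability for a large integer count with a continuous approximation of it. The binomial model, the values of n and p, and the function names are all assumptions chosen for illustration; they are not part of Analytica.

```python
import math

def binomial_cdf(k, n, p):
    """Exact P(X <= k) for an integer count X ~ Binomial(n, p)."""
    total = 0.0
    for i in range(k + 1):
        # Work in log space so large binomial coefficients do not overflow.
        log_pmf = (math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
                   + i * math.log(p) + (n - i) * math.log(1 - p))
        total += math.exp(log_pmf)
    return total

def normal_cdf(x, mean, sd):
    """P(Y <= x) for a continuous Y ~ Normal(mean, sd)."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

# Hypothetical count: 1000 independent trials, each succeeding with probability 0.3.
n, p = 1000, 0.3
mean = n * p
sd = math.sqrt(n * p * (1 - p))

k = 310
exact = binomial_cdf(k, n, p)
# Adding 0.5 is the usual continuity correction for an integer-valued quantity.
approx = normal_cdf(k + 0.5, mean, sd)

print(f"exact discrete probability:  {exact:.4f}")
print(f"continuous approximation:    {approx:.4f}")
```

Under these assumptions the two probabilities should agree closely, which is why a continuous distribution is usually an acceptable stand-in for a count this large.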
Conversely, it is often convenient to treat continuous quantities as discrete by partitioning the range
of possible values into a small number of intervals. For example, instead of modeling human
age as a continuous quantity between 0 and 120 years, you might partition people into
infants (under 2 years), children (2 to 12), teenagers (13 to 19), young adults (20 to 40), middle-aged
(41 to 65), and seniors (over 65 years). This process is termed discretizing. It is often convenient
to discretize continuous quantities before assessing probability distributions.
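The age example can be sketched in Python as follows. This is a minimal illustration, not Analytica code: it assumes ages are given in completed whole years and that age 2 belongs to the child group, and the names AGE_EDGES, AGE_GROUPS, discretize_age, and group_probabilities are hypothetical.

```python
from bisect import bisect_right
from collections import Counter

# Lowest age (in completed years) of every group after the first.
# These cut points are assumptions based on the age ranges in the text.
AGE_EDGES = [2, 13, 20, 41, 66]
AGE_GROUPS = ["infant", "child", "teenager", "young adult", "middle-aged", "senior"]

def discretize_age(age_years):
    """Map an age between 0 and 120 years to one of the six groups."""
    if not 0 <= age_years <= 120:
        raise ValueError("age must be between 0 and 120 years")
    # bisect_right counts how many cut points are <= the age,
    # which is exactly the index of the group the age falls into.
    return AGE_GROUPS[bisect_right(AGE_EDGES, age_years)]

def group_probabilities(ages):
    """Estimate a discrete distribution over the groups from sample ages."""
    counts = Counter(discretize_age(a) for a in ages)
    return {group: counts[group] / len(ages) for group in AGE_GROUPS}

# Made-up sample ages, used only to show the shape of the result.
sample_ages = [1, 5, 8, 15, 30, 35, 50, 70]
print(group_probabilities(sample_ages))
```

Once the quantity is discretized this way, a probability needs to be assessed only for each of the six groups rather than for every possible age.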