Difference between revisions of "Negative binomial distribution"

(Rolled back to NegativeBinomial)
(description for Inv function)
 
(10 intermediate revisions by 2 users not shown)
Line 1: Line 1:
 
[[Category:Distribution Functions]]
 
[[Category:Distribution Functions]]
''(new to [[Analytica 4.5]])''
+
[[Category:Discrete distributions]]
 +
[[Category:Semi-bounded distributions]]
 +
[[Category:Unimodal distributions]]
 +
[[Category:Univariate distributions]]
 +
{{ReleaseBar}}
  
= NegativeBinomial(r,p) =
+
The negative binomial distribution is a discrete probability distribution that models the number of successes that occur before «r» failures, where each independent trial is a success with probability «p».  The sample values are non-negative integers.
  
The negative binomial distribution is a discrete probability distribution that models the number of successes that occur before «r» failures, where each independent trial is a success with probability «p».  The sample values are non-negative integers.
+
<center><code>3+NegativeBinomial(3, 30%)</code> &rarr; [[image:Number batters during inning.png]]</center>
  
The [[NegativeBinomial]] distribution can be considered to be one of the three basic discrete distributions on the non-negative integers, with [[Poisson]] and [[Binomial]] being the other two.  If we characterize discrete distributions according to the first two moments -- specifically how the variance compares to the mean -- then three distributions span the space of possibilities.  For the [[Binomial]] distribution the variance is less than the mean, for the [[Poisson]] they are equal, and for the [[NegativeBinomial]] distribution the variance is greater than the mean. Turning this around, if you are trying to decide which of the discrete distributions to use to describe an uncertain quantity and all you have is the first two moments, then you can chose between these three distributions based on whether the variance is less than, equal to, or greater than the mean.
+
The [[NegativeBinomial]] distribution can be considered one of the three basic discrete distributions on the non-negative integers, with [[Poisson]] and [[Binomial]] being the other two.  If we characterize discrete distributions according to the first two moments -- specifically how the variance compares to the [[mean]] -- then three distributions span the space of possibilities.  For the [[Binomial]] distribution the [[variance]] is less than the [[mean]], for the [[Poisson]] they are equal, and for the [[NegativeBinomial]] distribution the [[variance]] is greater than the [[mean]]. Turning this around, if you are trying to decide which of the discrete distributions to use to describe an uncertain quantity and all you have is the first two moments, then you can choose between these three distributions based on whether the variance is less than, equal to, or greater than the mean.
  
== Probability Function ==
+
== Functions ==
 +
=== Parameters ===
 +
* «r»: The number of failures before terminating.
 +
* «p»: The probability of a success.
  
 +
=== NegativeBinomial(r, p) ===
 +
=== <div id="ProbNegativeBinomial">Prob{{Release||5.1|_}}NegativeBinomial(k, r, p)</div>===
 +
{{Release||5.1|To use, add the [[Distribution Densities Library]] to your model.}}
 
The probability distribution function for the [[NegativeBinomial]] is:
 
The probability distribution function for the [[NegativeBinomial]] is:
  
:<math>P(x=k) = Combinations(k, k+r-1) * p^k * (1-p)^r</math>
+
:<math>P(x=k) = \binom{k+r-1}{k} * p^k * (1-p)^r</math>
  
The cumulative density is
+
=== <div id="CumNegativeBinomial">CumNegativeBinomial(k, r, p)</div> ===
 +
{{Release||5.1|To use, add the [[Distribution Densities Library]] to your model.}}
  
 +
Analytically computes the probability of seeing «k» or fewer successes by the time «r» failures occur, when each independent [[Bernoulli]] trial has probability «p» of success. This is the cumulative probability function for the [[NegativeBinomial]] distribution.
 
:<math>P(x\le k) = 1-BetaI(p,k+1,r)</math>
 
:<math>P(x\le k) = 1-BetaI(p,k+1,r)</math>
  
= Library =
+
where [[BetaI]] is the incomplete beta function.
 +
 
 +
=== <div id="CumNegativeBinomInv">CumNegativeBinomInv(u, r, p)</div> ===
 +
{{Release||5.1|To use, add the [[Distribution Densities Library]] to your model.}}
  
Distributions
+
The inverse of the [[CumNegativeBinomial]](k, r, p) function.  This computes the value «k» such that the probability that a value sampled from a [[NegativeBinomial]](r, p) distribution is «k» or less equals «u».
  
= Examples =
+
Note: The identifier abbreviates "Binomial" to "Binom" to keep the identifier below 20 characters in total.
  
 +
== Examples ==
 
A shoplifter has a 20% chance of getting caught and convicted each time he commits the crime (hence, his probability of success is 80%).  The third conviction carries jail time.  The number of times he shoplifts and gets away with it before being thrown in jail is given by
 
A shoplifter has a 20% chance of getting caught and convicted each time he commits the crime (hence, his probability of success is 80%).  The third conviction carries jail time.  The number of times he shoplifts and gets away with it before being thrown in jail is given by
:<code>[[NegativeBinomial]](3,80%)</code>
+
:<code>NegativeBinomial(3, 80%)</code>
 +
 
 
:[[image:Successful_shoplifts.png]]
 
:[[image:Successful_shoplifts.png]]
  
 
A certain baseball player hits a home run, on average, once in every 12 times he is at bat.  To beat Barry Bonds's record of 73 home runs in a single season, the number of times he would need to be at bat is given by
 
A certain baseball player hits a home run, on average, once in every 12 times he is at bat.  To beat Barry Bonds's record of 73 home runs in a single season, the number of times he would need to be at bat is given by
:<code>[[NegativeBinomial]](74,1-1/12) + 74</code>
+
:<code>NegativeBinomial(74, 1-1/12) + 74</code>
In this case, when we talk about the NegativeBinomial distribution modeling the number of successes before a failure, here we apply this by calling a non-home run "a success", which happens with probability <code>1-1/12</code>.  To beat Barry Bond's record of 73, we need 74 home runs, and the since the total number of at bats includes both successes and failures, we add the 74 homeruns to the result.  The cumulative distribution is shown in the next graph which shows that he has a 20% probability of beating the record if he can get to bat at least 800 times.
+
 
 +
Here the [[NegativeBinomial]] distribution, which models the number of successes before «r» failures, is applied by calling a non-home run "a success", which happens with probability <code>1-1/12</code>, and a home run "a failure".  To beat Barry Bonds's record of 73, he needs 74 home runs, and since the total number of at bats includes both successes and failures, we add the 74 home runs to the result.  The cumulative distribution is shown in the next graph, which shows that he has about a 20% probability of beating the record if he gets to bat 800 times.
 +
 
 
:[[image:Times_at_bat.png]]
 
:[[image:Times_at_bat.png]]
  
= See Also =
+
== Alternate parameterizations ==
 +
 
 +
=== Mean and Dispersion ===
 +
 
 +
The negative binomial is sometimes parameterized by the mean <code>m</code> and <code>r</code>. This is the same <code>r</code> as in the standard parameterization above, but is harder to interpret as the number of failures when using this parameterization, and is instead called the ''dispersion parameter'', ''shape parameter'' or ''clustering coefficient'' <sup>[http://en.wikipedia.org/wiki/Negative_binomial_distribution#Alternative_formulations 1]</sup>.  With this parameterization, use
 +
:<code>[[NegativeBinomial]]( r, m / (m+r) )</code>
 +
 
 +
=== Mean and Variance ===
 +
 
 +
Given the mean <code>m</code> and variance <code>v</code>, use
 +
:<code>[[NegativeBinomial]]( m^2 / ( v - m), (v-m) / v )</code>
 +
 
 +
==History==
 +
[[NegativeBinomial]] was introduced in [[Analytica 4.5]], replacing the earlier function [[NegBinomial]].
 +
 
 +
The analytic distribution functions were added as built-in functions in [[Analytica 5.2]].
  
* [[CumNegativeBinomial]](k,r,p)
+
== See Also ==
* [[ProbNegativeBinomial]](k,r,p)
+
<div style="column-count:2;-moz-column-count:2;-webkit-column-count:2">
* [[CumNegativeBinomInv]](u,r,p)
+
* [[Binomial distribution]]
* [[Binomial]](n,p)
+
* [[Poisson distribution]]
* [[Poisson]](mean)
+
* [[BetaI]]
* [[BetaI]], [[BetaIaInv]], [[Combinations]]
+
* [[BetaIaInv]]
 +
* [[Combinations]]
 +
* [[Parametric discrete distributions]]
 +
* [[Distribution Densities Library]]
 +
</div>

Latest revision as of 19:56, 7 December 2018





The negative binomial distribution is a discrete probability distribution that models the number of successes that occur before «r» failures, where each independent trial is a success with probability «p». The sample values are non-negative integers.

3+NegativeBinomial(3, 30%) → Number batters during inning.png

The NegativeBinomial distribution can be considered one of the three basic discrete distributions on the non-negative integers, with Poisson and Binomial being the other two. If we characterize discrete distributions according to the first two moments -- specifically how the variance compares to the mean -- then three distributions span the space of possibilities. For the Binomial distribution the variance is less than the mean, for the Poisson they are equal, and for the NegativeBinomial distribution the variance is greater than the mean. Turning this around, if you are trying to decide which of the discrete distributions to use to describe an uncertain quantity and all you have is the first two moments, then you can choose between these three distributions based on whether the variance is less than, equal to, or greater than the mean.

Functions

Parameters

  • «r»: The number of failures before terminating.
  • «p»: The probability of a success.

NegativeBinomial(r, p)

ProbNegativeBinomial(k, r, p)

The probability distribution function for the NegativeBinomial is:

[math]\displaystyle{ P(x=k) = \binom{k+r-1}{k} * p^k * (1-p)^r }[/math]
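
As an illustration only, the formula above can be evaluated in a few lines of Python (this is not Analytica code, and the function name neg_binomial_pmf is a hypothetical helper, not part of any library). The sketch assumes 0 < «p» < 1 and uses the log-gamma function for the binomial coefficient so that «r» need not be an integer.

```python
from math import exp, lgamma, log

def neg_binomial_pmf(k, r, p):
    """P(x = k): probability of exactly k successes before the r-th failure.

    Assumes 0 < p < 1 and r > 0 (illustrative sketch, not Analytica's code).
    """
    if k < 0 or k != int(k):
        return 0.0
    # log of binom(k + r - 1, k), written with log-gamma so r may be non-integer
    log_coeff = lgamma(k + r) - lgamma(k + 1) - lgamma(r)
    return exp(log_coeff + k * log(p) + r * log(1 - p))

# Shoplifter example from this page: r = 3, p = 80%
probs = [neg_binomial_pmf(k, 3, 0.8) for k in range(400)]
total = sum(probs)  # the probabilities over all k should add up to (almost exactly) 1
```

Working on the log scale avoids overflow in the binomial coefficient when «k» or «r» is large.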

CumNegativeBinomial(k, r, p)

Analytically computes the probability of seeing «k» or fewer successes by the time «r» failures occur, when each independent Bernoulli trial has probability «p» of success. This is the cumulative probability function for the NegativeBinomial distribution.

[math]\displaystyle{ P(x\le k) = 1-BetaI(p,k+1,r) }[/math]

where BetaI is the incomplete beta function.
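
The probability of «k» or fewer successes can be cross-checked without an incomplete beta routine. The Python sketch below (illustrative only; the function names are hypothetical and «r» is assumed to be a positive integer) sums the probability function directly and compares the result with an equivalent statement of the same event: at least «r» failures occur within the first «k» + «r» trials.

```python
from math import comb

def neg_binomial_pmf(k, r, p):
    """P(x = k) for integer r >= 1."""
    return comb(k + r - 1, k) * p**k * (1 - p)**r

def cum_neg_binomial(k, r, p):
    """P(x <= k) by direct summation of the probability function."""
    return sum(neg_binomial_pmf(j, r, p) for j in range(k + 1))

def cum_via_binomial_tail(k, r, p):
    """P(at least r failures in the first k + r trials), which equals P(x <= k)."""
    n = k + r
    q = 1 - p  # probability of a failure on each trial
    return sum(comb(n, j) * q**j * p**(n - j) for j in range(r, n + 1))

direct = cum_neg_binomial(10, 3, 0.8)
via_tail = cum_via_binomial_tail(10, 3, 0.8)  # agrees with direct up to rounding
```

For large «k» the closed form with BetaI is preferable, since direct summation costs time proportional to «k».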

CumNegativeBinomInv(u, r, p)

The inverse of the CumNegativeBinomial(k, r, p) function. This computes the value «k» such that the probability that a value sampled from a NegativeBinomial(r, p) distribution is «k» or less equals «u».

Note: The identifier abbreviates "Binomial" to "Binom" to keep the identifier below 20 characters in total.
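
Because the distribution is discrete, an exact «k» usually does not exist for an arbitrary «u». The Python sketch below assumes one common convention, returning the smallest «k» whose cumulative probability is at least «u»; whether Analytica uses exactly this convention is an assumption, and both function names are hypothetical.

```python
def cum_neg_binomial(k, r, p):
    """P(x <= k), accumulated with the ratio pmf(j+1)/pmf(j) = p*(j+r)/(j+1)."""
    if k < 0:
        return 0.0
    term = (1 - p) ** r  # pmf(0)
    total = term
    for j in range(k):
        term *= p * (j + r) / (j + 1)
        total += term
    return total

def cum_neg_binom_inv(u, r, p):
    """Smallest k with P(x <= k) >= u (assumed convention), for 0 <= u < 1, 0 <= p < 1."""
    if not 0 <= u < 1:
        raise ValueError("u must satisfy 0 <= u < 1")
    k = 0
    term = (1 - p) ** r
    cum = term
    # Stop once the target is reached, or once the terms underflow to zero
    while cum < u and term > 0:
        term *= p * (k + r) / (k + 1)
        k += 1
        cum += term
    return k

k_median = cum_neg_binom_inv(0.5, 3, 0.8)  # median number of successful shoplifts
```

The returned value brackets «u»: the cumulative probability at k_median is at least 0.5, and at k_median - 1 it is below 0.5.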

Examples

A shoplifter has a 20% chance of getting caught and convicted each time he commits the crime (hence, his probability of success is 80%). The third conviction carries jail time. The number of times he shoplifts and gets away with it before being thrown in jail is given by

NegativeBinomial(3, 80%)
Successful shoplifts.png
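
The shoplifter story can also be simulated trial by trial. This Python sketch (illustrative only, with a hypothetical function name and an arbitrary fixed seed) repeats Bernoulli trials until the third failure and compares the sample mean with the theoretical mean r·p/(1-p) = 3 × 0.8 / 0.2 = 12.

```python
import random

def simulate_successes_before_failures(r, p, rng):
    """Run Bernoulli trials until the r-th failure; return the number of successes."""
    successes = 0
    failures = 0
    while failures < r:
        if rng.random() < p:
            successes += 1  # got away with it
        else:
            failures += 1   # caught and convicted
    return successes

rng = random.Random(12345)  # fixed seed so the run is repeatable
samples = [simulate_successes_before_failures(3, 0.8, rng) for _ in range(20000)]
sample_mean = sum(samples) / len(samples)
theoretical_mean = 3 * 0.8 / (1 - 0.8)  # r * p / (1 - p)
```

With 20,000 samples the sample mean should land close to the theoretical value, though the exact figure depends on the seed.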

A certain baseball player hits a home run, on average, once in every 12 times he is at bat. To beat Barry Bonds's record of 73 home runs in a single season, the number of times he would need to be at bat is given by

NegativeBinomial(74, 1-1/12) + 74

Here the NegativeBinomial distribution, which models the number of successes before «r» failures, is applied by calling a non-home run "a success", which happens with probability 1-1/12, and a home run "a failure". To beat Barry Bonds's record of 73, he needs 74 home runs, and since the total number of at bats includes both successes and failures, we add the 74 home runs to the result. The cumulative distribution is shown in the next graph, which shows that he has about a 20% probability of beating the record if he gets to bat 800 times.

Times at bat.png
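
The probability read off the graph can be approximated numerically. The Python sketch below (illustrative, with hypothetical helper names) computes the probability that the 74th home run arrives within 800 at bats, that is, that NegativeBinomial(74, 1-1/12) is at most 800 - 74 = 726.

```python
from math import exp, lgamma, log

def neg_binomial_pmf(k, r, p):
    """P(x = k), computed on the log scale to avoid overflow (assumes 0 < p < 1)."""
    log_coeff = lgamma(k + r) - lgamma(k + 1) - lgamma(r)
    return exp(log_coeff + k * log(p) + r * log(1 - p))

def cum_neg_binomial(k, r, p):
    """P(x <= k) by direct summation."""
    return sum(neg_binomial_pmf(j, r, p) for j in range(k + 1))

home_runs_needed = 74      # one more than the record of 73
p_no_home_run = 1 - 1 / 12  # a "success" in the sense used on this page
at_bats = 800

# At most (at_bats - 74) non-home runs may occur before the 74th home run
prob_beats_record = cum_neg_binomial(at_bats - home_runs_needed,
                                     home_runs_needed, p_no_home_run)
expected_at_bats = home_runs_needed * 12  # mean of NegativeBinomial(74, 11/12) + 74
```

The expected number of at bats needed is 74 × 12 = 888, so 800 at bats falls somewhat short of the mean, consistent with a probability well below one half.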

Alternate parameterizations

Mean and Dispersion

The negative binomial is sometimes parameterized by the mean m and the parameter r. This is the same r as in the standard parameterization above, but in this parameterization it is harder to interpret as the number of failures, and it is instead called the dispersion parameter, shape parameter or clustering coefficient [1]. With this parameterization, use

NegativeBinomial( r, m / (m+r) )
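
A quick check of this conversion, written as a Python sketch with hypothetical function names: with p = m/(m+r), the mean of the standard parameterization, r·p/(1-p), comes back as m.

```python
def from_mean_dispersion(m, r):
    """Return the (r, p) pair to pass to NegativeBinomial(r, p), given mean m and dispersion r."""
    return r, m / (m + r)

def neg_binomial_mean(r, p):
    """Mean of the number of successes before r failures: r * p / (1 - p)."""
    return r * p / (1 - p)

r, p = from_mean_dispersion(12.0, 3.0)  # mean 12 with dispersion 3
recovered_mean = neg_binomial_mean(r, p)
```

In this case p works out to 0.8, matching the shoplifter example, whose mean is 12.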

Mean and Variance

Given the mean m and variance v, use

NegativeBinomial( m^2 / ( v - m), (v-m) / v )
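
The same kind of check applies here. In the Python sketch below (hypothetical function names), the variance is taken to be r·p/(1-p)^2, and the conversion is only meaningful when v > m, since the variance of this distribution always exceeds its mean.

```python
def from_mean_variance(m, v):
    """Return the (r, p) pair to pass to NegativeBinomial(r, p), given mean m and variance v."""
    if v <= m:
        raise ValueError("the variance must exceed the mean")
    return m**2 / (v - m), (v - m) / v

def neg_binomial_moments(r, p):
    """Mean and variance of the number of successes before r failures."""
    mean = r * p / (1 - p)
    variance = r * p / (1 - p) ** 2
    return mean, variance

r, p = from_mean_variance(12.0, 60.0)
mean, variance = neg_binomial_moments(r, p)  # should give back 12 and 60
```

When v is only slightly larger than m, r becomes very large and the distribution approaches a Poisson with the same mean.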

History

NegativeBinomial was introduced in Analytica 4.5, replacing the earlier function NegBinomial.

The analytic distribution functions were added as built-in functions in Analytica 5.2.

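See Also

  • Binomial distribution
  • Poisson distribution
  • BetaI
  • BetaIaInv
  • Combinations
  • Parametric discrete distributions
  • Distribution Densities Library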
