Log-normal distribution

Revision as of 20:55, 18 January 2016 by Bbecane


LogNormal(median, gsdev)

Generates a sample from a lognormal distribution with the given «median» and «gsdev» (geometric standard deviation). The logarithm of a lognormal random variable has a normal distribution.

A normal distribution is symmetric around its mean:

If x := Normal(mean, sdev), then P(x <= mean - sdev) = P(x >= mean + sdev) ≈ 0.16.

Analogously, a lognormal distribution is ratio-symmetric around its median:

If y := LogNormal(median, gsdev), then P(y <= median/gsdev) = P(y >= median*gsdev) ≈ 0.16.
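
The ratio symmetry can be checked numerically. The following is a minimal Python sketch, not Analytica code; the helper name `lognormal_cdf` and the values 10 and 2 are illustrative assumptions. It uses the fact that Ln(y) is normal with mean Ln(median) and standard deviation Ln(gsdev), so both tail probabilities reduce to the standard normal tail one standard deviation out, which is roughly 0.159.

```python
import math
from statistics import NormalDist

def lognormal_cdf(y, median, gsdev):
    """P(Y <= y) for a lognormal with the given median and geometric std. dev."""
    # Standardize on the log scale: ln(Y) ~ Normal(ln(median), ln(gsdev)).
    z = (math.log(y) - math.log(median)) / math.log(gsdev)
    return NormalDist().cdf(z)

median, gsdev = 10.0, 2.0  # illustrative values, not from the article
lower = lognormal_cdf(median / gsdev, median, gsdev)        # P(y <= median/gsdev)
upper = 1.0 - lognormal_cdf(median * gsdev, median, gsdev)  # P(y >= median*gsdev)
print(round(lower, 4), round(upper, 4))
```

Both printed values agree for any choice of median and gsdev greater than 1, because the standardized cut points are always -1 and +1.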

LogNormal accepts four parameters: «median», «gsdev» (geometric standard deviation), «mean», and «stddev» (standard deviation). You can specify any two of them, which is sufficient to determine the other two.

LogNormal(median: med, gsdev: gs) or just LogNormal(med, gs)
LogNormal(median: med, stddev: sd)
LogNormal(median: med, mean: mu)
LogNormal(mean: mu, stddev: s)
LogNormal(mean: mu, gsdev: gs)
LogNormal(gsdev: gs, stddev: sd)
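
Two parameters suffice because all four are functions of the mean and standard deviation of Ln(y). The Python sketch below illustrates two of the conversions using the standard lognormal identities; the function names are hypothetical, and this is not a description of how Analytica computes them internally.

```python
import math

def from_median_gsdev(median, gsdev):
    """Return (mean, stddev) of a lognormal given its median and gsdev."""
    sigma = math.log(gsdev)                      # std. dev. of ln(Y)
    mean = median * math.exp(sigma ** 2 / 2)     # mean = exp(mu + sigma^2/2)
    stddev = mean * math.sqrt(math.exp(sigma ** 2) - 1)
    return mean, stddev

def from_mean_stddev(mean, stddev):
    """Return (median, gsdev) of a lognormal given its mean and stddev."""
    sigma2 = math.log(1 + (stddev / mean) ** 2)  # variance of ln(Y)
    median = mean / math.exp(sigma2 / 2)
    gsdev = math.exp(math.sqrt(sigma2))
    return median, gsdev

# Round trip with illustrative values: recovers median 10 and gsdev 2
# up to floating-point error.
mean, stddev = from_median_gsdev(10.0, 2.0)
print(from_mean_stddev(mean, stddev))
```

The remaining pairs (for example «median» with «mean») follow from the same identities by solving for the log-scale standard deviation first.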

If you specify more than two parameters, it gives an error. If you specify no parameters, it defaults to the standard lognormal, i.e., the distribution whose natural logarithm is a unit normal with mean 0 and standard deviation 1.

Like other distributions, you can also give one or more «Over» indexes. These cause it to generate an array of independent lognormal distributions over the specified index(es). For example,

LogNormal(m, gsd, Over: i)

Syntax:

LogNormal(median, gsdev, mean, stddev: Optional Positive; over: ... Optional Atom)

Parameter Estimation

Suppose X contains sampled historical data indexed by I, and consisting solely of positive values. To estimate the parameters of the best-fit LogNormal distribution, the following parameter estimation formulae can be used:

«median» := Median(X, I) or Exp(Mean(Ln(X), I))
«gsdev» := Exp(SDeviation(Ln(X), I))
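
For readers working outside Analytica, the same estimators can be sketched in Python. The function name `fit_lognormal` is hypothetical, and the sketch assumes that SDeviation corresponds to the sample standard deviation (n - 1 denominator), which is what `statistics.stdev` computes.

```python
import math
import statistics

def fit_lognormal(xs):
    """Estimate (median, gsdev) from strictly positive data."""
    logs = [math.log(x) for x in xs]              # requires every x > 0
    median = math.exp(statistics.mean(logs))      # Exp(Mean(Ln(X), I))
    gsdev = math.exp(statistics.stdev(logs))      # Exp(SDeviation(Ln(X), I))
    return median, gsdev

# Illustrative data whose logarithms are 0, 1 and 2.
data = [math.exp(0), math.exp(1), math.exp(2)]
print(fit_lognormal(data))
```

For this data the logarithms have mean 1 and sample standard deviation 1, so both estimates come out equal to e up to floating-point error.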

A more general form, with one extra degree of freedom, is the LogNormal with an offset:

LogNormal(median, gsdev) - offset

The more general form can be adapted to data sets containing negative numbers. The offset is constrained so that every value of X + offset is positive:

offset > -Min(X, I)

To my knowledge, a closed-form formula for the offset does not exist, so finding its optimal value requires a 1-D search or optimization. However, I have found that the following heuristic estimation formulae come extremely close to the best-fit parameters with offset:

offset := -Min(X, I) + 2*(Median(X, I) - Min(X, I))/Sum(1, I)
median := Median(X + offset, I)
gsdev := Exp(SDeviation(Ln(X + offset), I))
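
The heuristic translates directly to Python. In this sketch the function name `fit_offset_lognormal` and the sample data are illustrative assumptions, Sum(1, I) is taken to be the number of data points, and SDeviation is again assumed to be the sample standard deviation. The heuristic needs the median of the data to exceed its minimum; otherwise the smallest shifted value is zero and its logarithm is undefined.

```python
import math
import statistics

def fit_offset_lognormal(xs):
    """Heuristic (offset, median, gsdev) for the model LogNormal(median, gsdev) - offset."""
    n = len(xs)                          # Sum(1, I)
    lo = min(xs)
    med = statistics.median(xs)
    # The smallest shifted value becomes 2*(med - lo)/n, which must be positive.
    offset = -lo + 2 * (med - lo) / n
    shifted = [x + offset for x in xs]
    median = statistics.median(shifted)
    gsdev = math.exp(statistics.stdev([math.log(v) for v in shifted]))
    return offset, median, gsdev

data = [-3.0, -1.0, 0.0, 2.0, 10.0]      # illustrative data with negative values
offset, median, gsdev = fit_offset_lognormal(data)
print(offset, median, gsdev)
```

Here the offset works out to 3 + 2*3/5 = 4.2, so the shifted data start at 1.2 and the constraint offset > -Min(X, I) holds.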
