Difference between revisions of "Truncate"

Latest revision as of 00:43, 10 February 2016

Truncate(ux, xmin, xmax)

Returns a distribution with the shape of uncertain quantity «ux», truncated so that it has no values below «xmin» or above «xmax».

Declaration:

Truncate(ux: numeric sample; xmin, xmax: optional scalar)

Details

When evaluated in Mid mode, it returns an estimate of the median of the truncated distribution. It always evaluates «ux» probabilistically and «xmin» and «xmax» according to context. The function must be given either «xmin» or «xmax» or both. Otherwise, it will give an evaluation error.

When evaluated in Sample mode, a re-sampling algorithm obtains sampleSize sample points between the bounds using an estimated shape of the original distribution (where the estimated shape is based on the original distribution's sample). This can be applied to any continuous distribution.
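The page does not spell out the re-sampling algorithm, so the following Python sketch is only one plausible reading of the description above, not Analytica's actual implementation. It assumes a non-empty sample, estimates the shape with a piecewise-linear empirical inverse CDF, and places each original point at an evenly spaced probability level between the two bounds. The name `truncate_sample` and its helper `quantile` are hypothetical.

```python
import random

def truncate_sample(sample, xmin=None, xmax=None):
    """Hypothetical rank-based re-sampling; not Analytica's actual algorithm."""
    if xmin is None and xmax is None:
        raise ValueError("give xmin, xmax, or both")
    lo = float("-inf") if xmin is None else xmin
    hi = float("inf") if xmax is None else xmax
    if lo > hi:
        raise ValueError("xmin > xmax")

    n = len(sample)
    xs = sorted(sample)
    # Special cases: the whole sample lies at or beyond one bound.
    if xs[-1] <= lo:
        return [lo] * n
    if xs[0] >= hi:
        return [hi] * n

    # Empirical CDF of the original sample at the two bounds.
    p_lo = sum(1 for x in xs if x < lo) / n
    p_hi = sum(1 for x in xs if x <= hi) / n

    def quantile(u):
        # Piecewise-linear inverse CDF through the sorted sample.
        t = u * n - 0.5
        if t <= 0:
            return xs[0]
        if t >= n - 1:
            return xs[-1]
        k = int(t)
        return xs[k] + (t - k) * (xs[k + 1] - xs[k])

    # Map the r-th smallest original point to an evenly spaced probability
    # level inside (p_lo, p_hi), so the output keeps the input's ordering.
    order = sorted(range(n), key=lambda i: sample[i])
    out = [0.0] * n
    for r, i in enumerate(order):
        u = p_lo + (r + 0.5) / n * (p_hi - p_lo)
        # Clamp, because interpolation next to a bound can overshoot it.
        out[i] = min(hi, max(lo, quantile(u)))
    return out

random.seed(0)
x = [random.gauss(0, 1) for _ in range(1000)]
y = truncate_sample(x, -1, 1)
print(len(y), min(y), max(y))  # 1000 points, all within [-1, 1]
```

The output has the same length as the input and lies entirely between the bounds. Other estimators of the shape (for example, a smoothed density) would fit the description equally well.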

Special Cases

If all values of «ux» ≤ «xmin», it returns a sample = «xmin». Similarly, if all values of «ux» ≥ «xmax», it returns a sample = «xmax».

It flags an evaluation error if «xmin» > «xmax».

Truncate "semi-preserves" the rank-order of sample «ux»:

Given

Y = Truncate(X, xmin)

then

X[Run = i] < X[Run = j] ⇒

Y[Run = i] ≤ Y[Run = j]

Hence, if all values of X are unique, the ranks are preserved -- i.e. the ranks of sample Y will correspond to the ranks of sample X. If X contains some repeated values, the ranks may not all be preserved.
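The implication above can be checked directly on any pair of samples. The Python helper below is a small illustration; the name `semi_preserves_rank` and the example values are made up for this sketch. It tests every pair of runs (i, j), which is quadratic in the sample size but keeps the correspondence with the stated condition obvious.

```python
def semi_preserves_rank(x, y):
    """True if x[i] < x[j] implies y[i] <= y[j] for every pair of runs."""
    n = len(x)
    return all(y[i] <= y[j]
               for i in range(n)
               for j in range(n)
               if x[i] < x[j])

# Runs 3 and 4 are tied in x but not in y: the condition still holds,
# although the ranks of y no longer match the ranks of x exactly.
x = [3, 1, 2, 2]
y = [2.5, 1.2, 1.8, 2.0]
print(semi_preserves_rank(x, y))            # True
print(semi_preserves_rank([1, 2], [2, 1]))  # False: the order is reversed
```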

Possible Future Enhancement

An optional flag might be added to return Max([xmin, Min([xmax, Mid(ux)])]) in Mid mode, rather than an estimate of the median. The median is the "correct" value according to Mid semantics, but has the disadvantage of requiring the original distribution to be evaluated in sample mode. In some large sampling applications, where explicit loops are programmed to avoid evaluating full samples in memory, this ability to avoid a full sample may offset (in some cases) the error from the approximation.

It is easy to insert your own UDF to perform this variant of truncate, e.g.:

Function ApproxTruncate(ux; xmin = -INF; xmax = INF)

Definition: if IsSampleEvalMode Then Truncate(ux, xmin, xmax) Else Max([xmin, Min([xmax, ux])])
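To see when the clamped Mid value agrees with the median of the truncated distribution and when it does not, the following Python sketch compares the two on a deterministic, evenly spaced sample. It is an illustration only: `clamp_mid` and `truncated_median` are hypothetical names, the median of the kept points stands in for Truncate's own estimate, and the sketch assumes that Mid(ux) equals the sample median.

```python
import statistics

INF = float("inf")

def clamp_mid(mid_ux, xmin=-INF, xmax=INF):
    # The approximation discussed above: Max([xmin, Min([xmax, Mid(ux)])]).
    return max(xmin, min(xmax, mid_ux))

def truncated_median(sample, xmin=-INF, xmax=INF):
    # Stand-in estimate: median of the points that fall within the bounds.
    # statistics.median raises StatisticsError if no point is kept.
    kept = [x for x in sample if xmin <= x <= xmax]
    return statistics.median(kept)

sample = [i + 0.5 for i in range(100)]  # evenly spaced over (0, 100)
mid_ux = statistics.median(sample)      # 50.0

# Bounds that bracket the median: both approaches agree.
print(clamp_mid(mid_ux, 20, 80), truncated_median(sample, 20, 80))

# Lower bound above the median: the clamp returns the bound itself,
# while the truncated distribution's median lies well inside the range.
print(clamp_mid(mid_ux, xmin=60), truncated_median(sample, xmin=60))
```

The approximation is exact here when the bounds are symmetric about the median, and it degrades as a bound moves past the median, which is the trade-off the paragraph above describes.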

@@ Line 1: / Line 1: @@
-= Function Truncate =
+[[Category:Distribution Functions]]
+[[Category:Doc Status C]] <!-- For Lumina use, do not change -->
+==Truncate(ux, ''xmin, xmax'')==
+Returns a distribution with the shape of uncertain quantity «ux», truncated so that it has no values below «xmin» or above «xmax».
-Returns a distribution with the shape of uncertain quantity '''ux''', truncated so that it has no values below '''xmin''' or above '''xmax'''.
+Declaration:
+:[[Truncate]](ux: numeric sample; xmin, xmax: optional scalar)
-== Declaration ==
+== Details ==
+When evaluated in [[Mid]] mode, it returns an estimate of the median of the truncated distribution. It always evaluates «ux» probabilistically and «xmin» and «xmax» according to context.  The function must be given either «xmin» or «xmax» or both. Otherwise, it will give an evaluation error.
+When evaluated in [[Sample]] mode, a re-sampling algorithm obtains [[sampleSize]] sample points between the bounds using an estimated shape of the original distribution (where the estimated shape is based on the original distribution's sample).  This can be applied to any continuous distribution.
- Truncate(ux: numeric sample; xmin, xmax: optional scalar)
+== Special Cases ==
+If all values of «ux» &le; «xmin», it returns a sample = «xmin». Similarly, if all values of «ux» &ge; «xmax», it returns a sample = «xmax».
-== Details ==
+It flags an evaluation error if «xmin» > «xmax».
-When evaluated in Mid mode, it returns an estimate of the median of the truncated distribution. It always evaluates ux probabilistically and xmin and xmax according to context.  The function must be given either '''xmin''' or '''xmax''' or both: Otherwise, it will give an evaluation error.
+[[Truncate]] "semi-preserves" the rank-order of sample «ux»:
-When evaluated in Sample mode, the a re-sampling algorithm obtains sampleSize sample points between the bounds using an estimated shape of the original distribution (where the estimated shape is based on the original distribution's sample).  This can be applied to any continuous distribution.
+Given
+:<code>Y = Truncate(X, xmin)</code>
-== Special Cases ==
+then
+:<code>X[Run = i] < X[Run = j] &rArr; </code>
+:<code>Y[Run = i] &le; Y[Run = j]</code>
-If all values of '''ux''' <= '''xmin''', it returns a sample = '''xmin'''. Similarly, if all values of ux >= '''xmax''', it returns a sample = '''xmax'''.
+Hence, if all values of <code>X</code> are unique, the ranks are preserved -- i.e. the ranks of sample <code>Y</code> will correspond to the ranks of sample <code>X</code>.  If <code>X</code> contains some repeated values, the ranks may not all be preserved.
-It flags an evaluation error if '''xmin''' > '''xmax'''.
+== Possible Future Enhancement ==
+An optional flag might be added to return <code>Max([xmin, Min([xmax, Mid(ux)])])</code> in [[Mid]] mode, rather than an estimate of the [[median]].  The median is the "correct" value according to Mid semantics, but has the disadvantage of requiring the original distribution to be evaluated in sample mode.  In some large sampling applications, where explicit loops are programmed to avoid evaluating full samples in memory, this ability to avoid a full sample may offset (in some cases) the error from the approximation.
-Truncate() "semi-preserves" the rank-order of sample ux: Given Y = Truncate(X, xmin), then X[Run=i] < X[Run=j] ==> Y[Run=i] <= Y[Run=j]. Hence, if all values of X are unique, the ranks are preserved -- i.e. the ranks of sample Y will correspond to the ranks of sample X.  If X contains some repeated values, the ranks may not be all preserved.
+It is easy to insert your own [[User-Defined Functions|UDF]] to perform this variant of truncate, e.g.:
-== Possible Future Enhancement ==
+:<code>Function ApproxTruncate(ux; xmin = -INF; xmax = INF)</code>
+:<code>Definition: if IsSampleEvalMode Then Truncate(ux, xmin, xmax) Else Max([xmin, Min([xmax, ux])])</code>
-An optional flag might be added to return Max([xmin,Min([xmax,Mid(ux)])]) in Mid mode, rather than an estimate of the median.  The median is the "correct" value according to Mid semantics, but has the disadvantage of requiring the original distribution to be evaluated in sample mode.  In some large sampling applications, where explicit loops are programmed to avoid evaluating full samples in memory, this ability to avoid a full sample may offset (in some cases) the error from the approximation.
+==See Also==
+* [[Run]]
+* [[Special probabilistic functions]]
+* [[Distribution Functions]]
+* [[Distribution Densities Library]]