Difference between revisions of "Truncate"

Revision as of 23:10, 27 January 2016

Truncate(ux, xmin, xmax)

Returns a distribution with the shape of uncertain quantity «ux», truncated so that it has no values below «xmin» or above «xmax».

Declaration:

Truncate(ux: numeric sample; xmin, xmax: optional scalar)

Details

When evaluated in Mid mode, Truncate returns an estimate of the median of the truncated distribution. It always evaluates «ux» probabilistically, and evaluates «xmin» and «xmax» according to context. At least one of «xmin» and «xmax» must be given; otherwise, an evaluation error results.

When evaluated in Sample mode, a re-sampling algorithm obtains sampleSize sample points between the bounds, using a shape estimated from the original distribution's sample. This can be applied to any continuous distribution.
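The re-sampling idea can be illustrated outside Analytica. The following Python sketch is only one possible approach, not Analytica's actual algorithm: it keeps the sample points that fall between the bounds, treats them as the estimated shape, and reads off one interpolated quantile per run. The name `truncate_sample`, the quantile rule, and the error raised when no points fall between the bounds are assumptions made for illustration.

```python
import random


def truncate_sample(xs, xmin=None, xmax=None):
    """Illustrative re-sampling of xs into [xmin, xmax] (hypothetical helper)."""
    if xmin is None and xmax is None:
        raise ValueError("give xmin, xmax, or both")
    lo = float("-inf") if xmin is None else xmin
    hi = float("inf") if xmax is None else xmax
    if lo > hi:
        raise ValueError("xmin > xmax")

    n = len(xs)
    # Degenerate cases: the whole sample lies on one side of a bound.
    if all(x <= lo for x in xs):
        return [lo] * n
    if all(x >= hi for x in xs):
        return [hi] * n

    # Estimated shape: the sorted sample points that lie within the bounds.
    kept = sorted(x for x in xs if lo <= x <= hi)
    if not kept:
        # Assumption: this sketch gives up when nothing falls between the bounds.
        raise ValueError("no sample points fall between the bounds")

    # Visit runs in order of increasing x so that each run keeps its rank.
    order = sorted(range(n), key=lambda i: xs[i])
    ys = [0.0] * n
    for rank, i in enumerate(order):
        u = (rank + 0.5) / n              # quantile level assigned to this run
        pos = u * (len(kept) - 1)
        k = int(pos)
        nxt = kept[min(k + 1, len(kept) - 1)]
        y = kept[k] + (pos - k) * (nxt - kept[k])
        ys[i] = min(max(y, kept[k]), nxt)  # guard against rounding overshoot
    return ys


random.seed(1)
xs = [random.gauss(0.0, 1.0) for _ in range(200)]
ys = truncate_sample(xs, xmin=-1.0, xmax=2.0)
print(min(ys), max(ys))
```

The result has the same number of points as the input and every point lies between the bounds. Because the quantile level increases with the rank of the original value, the output also keeps the run-by-run ordering of the input.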

Special Cases

If all values of «ux» ≤ «xmin», it returns a sample in which every value equals «xmin». Similarly, if all values of «ux» ≥ «xmax», it returns a sample in which every value equals «xmax».

It flags an evaluation error if «xmin» > «xmax».

Truncate "semi-preserves" the rank-order of sample «ux»:

Given

Y = Truncate(X, xmin)

then

X[Run = i] < X[Run = j] ⇒

Y[Run = i] ≤ Y[Run = j]

Hence, if all values of X are unique, the ranks are preserved, i.e. the ranks of sample Y correspond to the ranks of sample X. If X contains repeated values, some ranks may not be preserved.
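The implication above can be checked mechanically on any pair of samples. The Python sketch below is an illustration only; `semi_preserves_rank` is a hypothetical helper, not part of Analytica, and the lists stand in for the values of X and Y indexed by Run.

```python
def semi_preserves_rank(xs, ys):
    """True if xs[i] < xs[j] implies ys[i] <= ys[j] for every pair of runs."""
    n = len(xs)
    return all(
        ys[i] <= ys[j]
        for i in range(n)
        for j in range(n)
        if xs[i] < xs[j]
    )


# Unique X values: the ordering of Y must follow the ordering of X.
print(semi_preserves_rank([1.0, 2.0, 3.0], [2.0, 2.0, 3.0]))

# Repeated X values: the two tied runs may come out in either order in Y,
# so the property holds even though their ranks need not match.
print(semi_preserves_rank([1.0, 1.0, 2.0], [1.2, 1.1, 2.0]))
```

The second call shows why repeated values weaken the guarantee: the condition only constrains pairs where one X value is strictly smaller than the other.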

Possible Future Enhancement

An optional flag might be added to return Max([xmin, Min([xmax, Mid(ux)])]) in Mid mode, rather than an estimate of the median. The median is the "correct" value according to Mid semantics, but has the disadvantage of requiring the original distribution to be evaluated in Sample mode. In some large sampling applications, where explicit loops are programmed to avoid evaluating full samples in memory, the ability to avoid a full sample may, in some cases, outweigh the error from the approximation.

It is easy to insert your own UDF to perform this variant of truncate, e.g.:

Function ApproxTruncate(ux; xmin = -INF; xmax = INF)

Definition: If IsSampleEvalMode Then Truncate(ux, xmin, xmax) Else Max([xmin, Min([xmax, ux])])
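To get a feel for the size of the approximation error, the clamped mid value can be compared with the true median of a truncated distribution. The Python sketch below assumes a normal distribution and computes its truncated median analytically, which differs from the sample-based estimate that Truncate returns; `clamp_mid` and `truncated_normal_median` are hypothetical names used only for this illustration.

```python
from statistics import NormalDist

INF = float("inf")


def clamp_mid(mid, xmin=-INF, xmax=INF):
    """The approximation: clamp the untruncated mid value into the bounds."""
    return max(xmin, min(xmax, mid))


def truncated_normal_median(mu, sigma, xmin=-INF, xmax=INF):
    """Exact median of Normal(mu, sigma) truncated to [xmin, xmax]."""
    d = NormalDist(mu, sigma)
    p_lo, p_hi = d.cdf(xmin), d.cdf(xmax)
    # The median sits halfway between the two bounds in probability.
    return d.inv_cdf((p_lo + p_hi) / 2.0)


# Bounds that straddle the mid value: the two answers nearly agree.
print(clamp_mid(0.0, -1.0, 1.0), truncated_normal_median(0.0, 1.0, -1.0, 1.0))

# A lower bound above the mid value: the clamp returns the bound itself,
# while the true median lies further inside the truncated range.
print(clamp_mid(0.0, xmin=1.0), truncated_normal_median(0.0, 1.0, xmin=1.0))
```

The approximation is close when the bounds straddle the untruncated mid value and degrades when the mid value falls outside them, since the clamp can then only return a bound.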

@@ Line 1: / Line 1: @@
-= Function Truncate =
+[[Category:Distribution Functions]]
 [[Category:Doc Status C]] <!-- For Lumina use, do not change -->
-Returns a distribution with the shape of uncertain quantity '''ux''', truncated so that it has no values below '''xmin''' or above '''xmax'''.
+==Truncate(ux, ''xmin, xmax'')==
+Returns a distribution with the shape of uncertain quantity «ux», truncated so that it has no values below «xmin» or above «xmax».
-== Declaration ==
+Declaration:
+:[[Truncate]](ux: numeric sample; xmin, xmax: optional scalar)
-  Truncate(ux: numeric sample; xmin, xmax: optional scalar)
+== Details ==
+When evaluated in [[Mid]] mode, it returns an estimate of the median of the truncated distribution. It always evaluates «ux» probabilistically and «xmin» and «xmax» according to context.  The function must be given either «xmin» or «xmax» or both. Otherwise, it will give an evaluation error.
-== Details ==
+When evaluated in [[Sample]] mode, a re-sampling algorithm obtains [[sampleSize]] sample points between the bounds using an estimated shape of the original distribution (where the estimated shape is based on the original distribution's sample).  This can be applied to any continuous distribution.
-When evaluated in Mid mode, it returns an estimate of the median of the truncated distribution. It always evaluates ux probabilistically and xmin and xmax according to context.  The function must be given either '''xmin''' or '''xmax''' or both: Otherwise, it will give an evaluation error.
+== Special Cases ==
+If all values of «ux» &le; «xmin», it returns a sample = «xmin». Similarly, if all values of «ux» &ge; «xmax», it returns a sample = «xmax».
-When evaluated in Sample mode, the a re-sampling algorithm obtains sampleSize sample points between the bounds using an estimated shape of the original distribution (where the estimated shape is based on the original distribution's sample).  This can be applied to any continuous distribution.
+It flags an evaluation error if «xmin» > «xmax».
-== Special Cases ==
+[[Truncate]] "semi-preserves" the rank-order of sample «ux»:
-If all values of '''ux''' <= '''xmin''', it returns a sample = '''xmin'''. Similarly, if all values of ux >= '''xmax''', it returns a sample = '''xmax'''.
+Given
+:<code>Y = Truncate(X, xmin)</code>
-It flags an evaluation error if '''xmin''' > '''xmax'''.
+then
+:<code>X[Run = i] < X[Run = j] &rArr; </code>
+:<code>Y[Run = i] &le; Y[Run = j]</code>
-Truncate() "semi-preserves" the rank-order of sample ux: Given Y = Truncate(X, xmin), then X[Run=i] < X[Run=j] ==> Y[Run=i] <= Y[Run=j]. Hence, if all values of X are unique, the ranks are preserved -- i.e. the ranks of sample Y will correspond to the ranks of sample X.  If X contains some repeated values, the ranks may not be all preserved.
+Hence, if all values of <code>X</code> are unique, the ranks are preserved -- i.e. the ranks of sample <code>Y</code> will correspond to the ranks of sample <code>X</code>.  If <code>X</code> contains some repeated values, the ranks may not be all preserved.
 == Possible Future Enhancement ==
+An optional flag might be added to return <code>Max([xmin, Min([xmax, Mid(ux)])])</code> in [[Mid]] mode, rather than an estimate of the [[median]].  The median is the "correct" value according to Mid semantics, but has the disadvantage of requiring the original distribution to be evaluated in sample mode.  In some large sampling applications, where explicit loops are programmed to avoid evaluating full samples in memory, this ability to avoid a full sample may offset (in some cases) the error from the approximation.
-An optional flag might be added to return Max([xmin,Min([xmax,Mid(ux)])]) in Mid mode, rather than an estimate of the median.  The median is the "correct" value according to Mid semantics, but has the disadvantage of requiring the original distribution to be evaluated in sample mode.  In some large sampling applications, where explicit loops are programmed to avoid evaluating full samples in memory, this ability to avoid a full sample may offset (in some cases) the error from the approximation.
+It is easy to insert your own [[User-Defined Functions|UDF]] to perform this variant of truncate, e.g.:
-It is easy to insert your own [[User-Defined Functions|UDF]] to perform this variant of truncate, e.g.:
+:<code>Function ApproxTruncate(ux; xmin = -INF; xmax = INF)</code>
+:<code>Definition: if IsSampleEvalMode Then Truncate(ux, xmin, xmax) Else Max([xmin, Min([xmax, ux])])</code>
- Function ApproxTruncate( ux; xmin=-INF ; xmax=INF )
+==See Also==
- Definition: if IsSampleEvalMode Then Truncate(ux,xmin,xmax) Else Max([xmin,Min([xmax,ux]))
+* [[Run]]
+* [[Special probabilistic functions]]
+* [[Distribution Densities Library]]