Excel to Analytica Mappings/Statistical Functions

< Excel to Analytica Mappings
Revision as of 19:59, 9 February 2012 by Lchrisman (talk | contribs) (GAMMA.INV)

AVEDEV(x1,x2,...)

Analytica equivalents:

Mean(x-Mean(x))
Mean(x-Mean(x,I),I)

AVERAGE(x1,x2,...), AVERAGEA(x1,x2,...)

Analytica equivalent:

Average(x,I)

AVERAGEA

AVERAGEIF

AVERAGEIFS

BETADIST(x,alpha,beta,A,B)

Analytica equivalents:

BetaI(x,alpha,beta)
BetaI((x-A)/(B-A),alpha,beta) 

To define a variable as a beta probability distribution, use:

Beta(alpha,beta,A,B)

BETA.DIST(x,alpha,beta,cumulative,A,B)

BETAINV(p,alpha,beta,A,B)

This is named BETA.INV(p,alpha,beta,A,B) in Excel 2010 ,but the equivalent is the same.

Analytica equivalents:

BetaIInv(p,alpha,beta)
BetaIInv(p,alpha,beta) * (B-A) + A

To define a variable as a beta probability distribution, use:

Beta(alpha,beta,A,B)

BINOMDIST(x,n,p)

Analytica equivalent:

Prob_Binomial(x,n,p)

To use this function, add the Distribution Densities Library to your model.

To define a variable as binomially distributed, use:

Binomial(n,p)

For the cumulative binomial probability, BINOMDIST(x,n,p,TRUE), the Analytica equivalent is:

Probability(Binomial(n,p)<=x)

Note, however, that this is evaluated using sampling, so for small sample sizes there could be some sampling error in the result.

BINOM.DIST(number_s,trials,probability_s,cumulative)

BINOM.INV(trials,probability_s,alpha)

CHIDIST(x,dof) or CHISQ.DIST(x,deg_freedom,cumulative)

Analytica equivalent for CHIDIST or CHISQ.DIST when cumulative is false:

Dens_ChiSquared(x,dof)

Equivalent of CHISQ.DIST when cumulative is true:

CumChiSquared(x,dof)

Both Dens_ChiSquared and CumChiSquared are in the Distribution Densities Library.

CHIINV(p,dof) or CHISQ.INV(p,dof)

Analytica equivalent:

2 * GammaIInv(p,dof/2)

CHISQ.INV.RT(p,dof)

CHITEST(actual,expected) or CHISQ.TEST(actual,expected)

For the 1-D case, where actual and expected are each 1-D, in Analytica these will have a common index, I, and the Analytica equivalent is:

Var n := Sum(1,I);
Var chi2 := Sum( (actual-expected)^2 / expected, I );
1-GammaI(chi2/2,(n-1)/2)

For the 2-D contingency table analysis, with indexes I and J, the equivalent is:

Var n := Sum(1,I,J);
Var chi2 := Sum( (actual-expected)^2 / expected, I, J );
1-GammaI(chi2/2,(n-1)/2)

CONFIDENCE(alpha,sd,n) or CONFIDENCE.NORM(alpha,sd,n)

Analytica equivalent:

CumNormalInv( 0.5 + (1-alpha)/2, 0, sd ) / Sqrt(n)

CONFIDENCE.T(alpha,sd,n)

Analytica equivalent:

-CumStudentTInv(alpha/2,n-1) * sd / Sqrt(n)

Note: CumStudentTInv is in the Distribution Densities library.

CORREL(x,y)

Analytica equivalents:

Correlation(x,y)
Correlation(x,y,I)

Use the first form, without the index, when measuring correlation across the Run index (e.g., the Monte Carlo uncertainty dimension).

COUNT(value1,value2,...), COUNTA(value1,value2,...)

Analytica equivalents, e.g.,:

Sum( IsNumber(x), I )
Sum( x<>Null, I )

COUNTBLANK

The notion of a blank Excel cell does not directly translate, but in Analytica we would generally consider a cell of an array to be blank when its value is null. Thus, the Analytica equivalent would be:

Sum( x=null, I )

COUNTIF(range,criteria)

The Analytica equivalent consists of expressing criteria as a boolean expression and then using:

Sum(criteria,I)

For example, to count the number of values greater than 55, use:

Sum(range>55,I)

COUNTIFS(range1,criteria1,range2,critieria2,range3,criteria3,...)

Most cases where you would use this function in Excel involve collections of data that are logically 3 or 4 dimensional, but which have to be broken into multiple tables because of Excel's intrinsic restriction to two dimensions. Hence, multiple ranges are naturally handled via array abstraction without a need for special handling.

For example, in an array named range with 4 dimensions: I,J,K and L, to count the number of values greater than 55 you would use:

Sum(range>55,I,J,K,L)

If the criteria varies along one or more of the dimensions, then your criteria would simply be indexed by that. For example, if you have separate bounds for each value along L, you would create an array, Lb, indexed by L and containing the lower bounds, and then use:

Sum(range>lb,I,J,K,L)

COVAR(array1,array2) or COVARIANCE.S(array1,array2)

Analytica equivalent:

CoVariance(array1,array2,I)

COVARIANCE.P(array1,array2)

Analytica equivalent:

Var n := Sum(1,I);
CoVariance(array1,array2,I) * (n-1) / n

CRITBINOM

DEVSQ(number1,number2,...)

Analytica equivalents:

Sum( (X-Mean(X,I))^2, I )
Variance(X,I) * (Sum(1,I)-1)

where the numbers are in array X indexed by I.

EXPONDIST(x,lambda,cumulative) or EXPON.DIST(x,lambda,cumulative)

If cumulative is false, the Analytica equivalent is:

Dens_Exponential(x,1/lambda)

When cumulative is true, the Analytica equivalent is:

CumExponential(x,1/lambda)

Note: These functions are in the Distribution Densities library

F.DIST(x,dof1,dof2,cumulative)

When cumulative is true, the Analytica equivalent is:

CumFDist(x,dof1,dof2)

When cumulative is false:

Dens_FDist(x,dof1,dof2)

Note: CumFDist and Dens_FDist are in the Distribution Densities library

FDIST(x,dof1,dof2) or F.DIST.RT(x,dof1,dof2)

The Analytica equivalent is

1-CumFDist(x,dof1,dof2)

Note: CumFDist is the Distribution Densities library

F.INV(p,dof1,dof2)

The Analytica equivalent is:

CumFDistInv(p,dof1,dof2)

FINV(p,dof1,dof2) or F.INV.RT(p,dof1,dof2)

The Analytica equivalent is:

CumFDistInv(1-p,dof1,dof2)

F.TEST(array1,array2)

The Analytica equivalent, assuming «array1» is indexed by I and «array2» is indexed by J is:

Var F0 := Variance(array1,I) / Variance(array2,J);
Var F  := Max([F0,1/F0]);
2 * (1-CumFDist(F,Sum(1,I)-1,Sum(1,J)-1) )

FISHER(x)

The Analytica equivalent is:

Ln( (1+x)/(1-x) ) / 2

FISHERINV(y)

The Analytica equivalent is:

Var e2y := Exp(2*y) Do (e2y-1)/(e2y+1)

FORECAST(x,known_y,known_x)

Analytica equivalent (Index I an the index in common with known_x and known_y):

Index K := ['b','m'];
Var B := Array(K,[1,known_x]);
Var Bx := Array(K,[1,x]);
Sum(Regression(known_y,B,I,K) * Bx, K )

FREQUENCY(data_array,bins_array)

Analytica equivalentd:

Frequency(data_array,bins_array)
Frequency(data_array,bins_array,I)

The second form is used when data_array is indexed by I. The first form takes the frequency of the uncertain sample, i.e., along the Run index.

FTEST(array1,array2)

The following User-Defined Function provides an Analytica equivalent:

Function FTest( A1 : ContextSamp[I], A2 : ContextSamp[J] ; I,J : Index = Run )
Definition:
   Var v1 := Variance(A1,I);
   Var v2 := Variance(A2,J);
   Var n1 := Sum(1,I);
   Var n2 := Sum(1,J);
   Var F := v2/v1;
   1 - CumFDist(F,n1-1,n2-1)

Add the above UDF to your model, then you can just use:

FTest(array1,array2,I,J)

where I and J are the indexes of array1 and array2 respectively.

GAMMADIST(x,a,b,cumulative) or GAMMA.DIST(x,a,b,cumulative)

Analytica equivalent of GAMMADIST(x,a,b):

Dens_Gamma(x,a,b)

Analytica equivalent of GAMMADIST(x,a,b,TRUE):

GammaI(x,a,b)

To define a variable as uncertain with a gamma probability distribution, use:

Gamma(a,b)

Note: Dens_Gamma is in the Distribution Densities library

GAMMAINV(p,a,b) or GAMMA.INV(p,a,b)

Analytica equivalent:

GammaIInv(p,a,b)

GAMMALN(x)

Analytica equivalent:

LGamma(x)

GEOMEAN

GROWTH

HARMEAN

HYPGEOMDIST(sample_s,number_sample,population_s,number_pop)

The Analytica equivalent is:

Prob_HyperGeometric(sample_s,number_sample,population,number_pop)

INTERCEPT(known_y,known_x)

Assume known_y and known_x share index I. The Analytica equivalent is:

Index K := ['b','m'];
Regression( known_y, Array(K,[1,known_x]), I, K ) [ K='b' ]

KURT(number1,number2,...)

Analytica equivalents:

Kurtosis(X)
Kurtosis(X,I)

LARGE(array,k)

Assume array is indexed by I. One possible Analytica equivalent is:

Var n := Sum(1,I);
X[I=ArgMax(Rank(array,I)=n+1-k,I)]

LINEST

LOGEST

LOGINV(p,lm,lsd)

The Analytica equivalent is:

CumLogNormalInv(p,Exp(lm),Exp(lsd))

LOGNORMDIST(x,lm,ls)

The Analytica equivalent is:

CumLogNormal(x,Exp(gm),Exp(gs))

MAX(number1,number2,...)

When MAX is applied in Excel to individual numbers, the Analytica equivalent is:

Max( [number1,number2,...] )

Take note of the square brackets.

When MAX is used in Excel by providing it with a range of cells, the Analytica equivalent is

Max(A,I)

where A is the array of values (analogous to Excel's range) and I is the index to take the max over. In the two-D case, where you want the Max over a 2-D region, this becomes:

Max(A,I,J)

MAXA

MEDIAN(number1,number2,...)

When MEDIAN is applied in Excel to individual numbers, the Analytica equivalent is:

Index x := [number1,number2,...];
GetFract( x,0.5,x )

In the more usual case, when MEDIAN is used in Excel by providing it with a range of cells, the Analytica equivalent is

GetFract(A,0.5,I)

where A is the array of values (analogous to Excel's range) and I is the index to take the median over.

MIN(number1,number2,...)

When MIN is applied in Excel to individual numbers, the Analytica equivalent is:

Min( [number1,number2,...] )

Take note of the square brackets.

When MIN is used in Excel by providing it with a range of cells, the Analytica equivalent is

Min(A,I)

where A is the array of values (analogous to Excel's range) and I is the index to take the minimum over. In the two-D case, where you want the minimum over a 2-D region, this becomes:

Min(A,I,J)

MINA

MODE(range)

Assuming the values in Analytica are in an array A indexed by I, the equivalent is:

Index V := Unique(Va1,Va1);
ArgMax(Frequency( Va1, V, Va1 ),V)

NEGBINOMDIST

NORMDIST(x,mean,standard_dev,cumulative)

If Cumulative=True, the Analytica equivalent is:

CumNormal(x,mean,standard_dev)

If Cumulative=False, the Analytica equivalent is:

Dens_Normal(x,mean,standard_dev)

Note that to use Dens_Normal, you must include the distribution densities library in your model.

NORMINV(p,mean,standard_dev)

Analytica equivalent:

CumNormalInv(p,mean,standard_dev)

NORMSDIST(z)

Analytica equivalent:

CumNormal(z)

NORMSINV(p)

Analytica equivalent:

 CumNormalInv(p)

PEARSON(array1,array2)

Analytica equivalent:

Correlation(array1,array2,I)

where the data points in array1 and array2 both are indexed by I.


PERCENTILE(array,p)

Assume that the array of values is indexed by I. Then the Analytica equivalent is:

GetFract(array,p,I)

PERCENTRANK(array,x,significance)

Returns the rank of a value in a data set as a percentage of the data set. The optional parameters, significance, defaults to 3 in Excel. Assume your array is A and is indexed by I. If you want the percent rank for all elements of A to full precision use:

(Rank(A,I)-1) / (Size(I)-1)

If you want the percent rank for all elements, but only to the indicated significance, use:

Floor( (Rank(A,I)-1) / (Size(I)-1), significance )

If you want to get the percent rank of a single element, use:

Floor( Sum(A<x,I) / (Size(I)-1), significance )

PERMUT(n,k)

Analytica equivalent:

Permutations(n,k)

POISSON(x,mean,cumulative)

When cumulative is false, the Analytica equivalent is

Prob_Poisson(x,mean)

When cumulative is true, the Analytica equivalent is

CumPoisson(x,mean)

In Analytica models, you'll often use the Poisson distribution function directly, rather than evaluating the probability or cumulative probability at a given point.

PROB(x_range,prob_range,lower_limit,upper_limit)

In Analytica, x_range and prob_range will share an index, call it I. When upper_limit is not specified, then the Analytica equivalent is:

Sum( prob_range * (lower_limit <= x_range ), I )

When upper_limit is specified, this becomes

Sum( prob_range * (lower_limit <= x_range and x_range <= upper_limit), I )

QUARTILE(array,quart)

Analytica equivalent, where array is assumed indexed by I:

GetFract(array,quart/4,I)

RANK(number,range,order)

Assume in Analytica that the array identified by range is indexed by I. Then when order is non-zero, the Analytica equivalent is:

Rank(range,I,type:-1)[I=number]

When order is omitted or zero in Excel, the Analytica equivalent is:

Rank(-range,I,type:1)[I=number]

RSQ(known_y,known_x)

Analytica equivalent:

Correlation(known_y,known_x,I)^2

SKEW(number1,number2,...)

Analytica equivalent (assume the numbers are in array X, indexed by I):

Skewness(X,I)

When X is an uncertain distribution, indexed by Run, the equivalent is:

Skewness(X)

SLOPE(known_y,known_x)

Assume known_y and known_x share index I. The Analytica equivalent is:

Index K := ['b','m'];
Regression( known_y, Array(K,[1,known_x]), I, K ) [ K='m' ]


SMALL(array,k)

Assume array is indexed by I. One possible Analytica equivalent is:

X[I=ArgMax(Rank(array,I)=k,I)]


STANDARDIZE(x,m,sd)

Analytica equivalent:

(x-m)/sd

STDEV(number1,number2,...)

Analytica equivalent (assume that the numbers are in array X indexed by I):

SDeviation(X,I)

STDEVP

Analytica equivalent (assume that the numbers are in array X indexed by I):

SDeviation(X,I) * (Sum(1,I)-1)/Sum(1,I)

STEYX(known_ys,known_xs)

To express the Analytica equivalent, I recommend first introducting an index and a function to your model:

Index Bm := ['b','m']
Function Basis_Bm(x) := Table(Bm)(1,x)

These functions make simple y=m*x+b linear regression very convenient. With these, the equivalent is, assuming the common index between known_ys and known_xs is I:

Var c := Regression(known_ys,Basis_Bm(known_xs),I,Bm);
known_ys - Sum( c * Basis_Bm(known_xs), Bm )

TDIST(x,dof,tails)

The Analytica equivalent is:

tails * (1-CumStudentT(x,dof))

TINV(p,dof)

Analytica equivalent:

CumStudentTInv( 1-p/2, dof )

Excel has a strange notion of a 2-tailed Student-T distribution, which basically multiplies the probability by 2. See its TDIST function. TINV seems to do the inverse of this 2-tailed case, which leads to this strange equivalence.

TREND(known_y,known_x,new_x,const)

Assume that the data in known_y and known_x are indexed by index I, and the points in new_x are indexed by J. Also assume that const is 1 (true) for a y=m*x+b curve, and 0 (false) for a y=m*b fit. Then the Analytica equivalent is:

Index K := ['b','m'];
Sum( Regression(known_y,Array(K,[const,known_x]),I,K) * Array(K,[const,new_x]), K )

The result is the set of predicted new_y values, indexed by J.

TRIMMEAN(array,percent)

Assume the array is indexed by I. The Analytica equivalent is

Var n := Sum(1,I);
Mean( A[I=SortIndex(A,I)], I, w:@I > percent*n/2 and @I < n+1-(percent*n/2) )

TTEST

VAR(number1,number2,...)

Analytica equivalent (assume numbers are in array X indexed by I):

Variance(X,I)

VARP

Analytica equivalent (assume numbers are in array X indexed by I):

Variance(X,I) * (Sum(1,I)-1)/Sum(1,I)


WEIBULL(x,alpha,beta,cumulative)

When cumulative=false, the Analytica equivalent is:

Dens_Weibull(x,alpha,beta)

When cumulative=true, the Analytica equivalent is:

CumWeibull(x,alpha,beta)

ZTEST

Comments


You are not allowed to post comments.