OpenOffice Calc User Guide
Previous Page	Home	Next Page

Statistical analysis functions

Calc includes over 70 statistical functions which enable the evaluation of data from simple arithmetic calculations, such as averaging, to advanced distribution and probability computations.Several other statistics-based functions are available through the Add-ins which are noted at the end of this appendix.

Table 13: Statistical analysis functions

Syntax	Description
AVEDEV(number1; number2; ... number_30)	Returns the average of the absolute deviations of data points from their mean. Displays the diffusion in a data set. Number_1; number_2; ... number_30 are values or ranges that represent a sample. Each number can also be replaced by a reference.
AVERAGE(number_1; number_2; ... number_30)	Returns the average of the arguments. Number_1; number_2; ... number_30 are numerical values or ranges. Text is ignored.
AVERAGEA(value_1; value_2; ... value_30)	Returns the average of the arguments. The value of a text is 0. Value_1; value_2; ... value_30 are values or ranges.
B(trials; SP; T_1; T_2)	Returns the probability of a sample with binomial distribution. Trials is the number of independent trials. SP is the probability of success on each trial. T_1 defines the lower limit for the number of trials. T_2 (optional) defines the upper limit for the number of trials.
BETADIST(number; alpha; beta; start; end)	Returns the cumulative beta probability density function. Number is the value between Start and End at which to evaluate the function. Alpha is a parameter to the distribution. Beta is a parameter to the distribution. Start (optional) is the lower bound for number. End (optional) is the upper bound for number.
BETAINV(number; alpha; beta; start; end)	Returns the inverse of the cumulative beta probability density function. Number is the value between Start and End at which to evaluate the function. Alpha is a parameter to the distribution. Beta is a parameter to the distribution. Start (optional) is the lower bound for number. End (optional) is the upper bound for number
BINOMDIST(X; trials; SP; C)	Returns the individual term binomial distribution probability. X is the number of successes in a set of trials. Trials is the number of independent trials. SP is the probability of success on each trial. C = 0 calculates the probability of a single event and C = 1 calculates the cumulative probability.
CHIDIST(number; degrees_freedom)	Returns the probability value that a hypothesis will be confirmed from the indicated chi square. The probability determined by CHIDIST can also be determined by CHITEST. Number is the chi-square value of the random sample used to determine the error probability. Degrees_freedom is the degrees of freedom of the experiment.
CHIINV(number; degrees_freedom)	Returns the inverse of the one-tailed probability of the chi-squared distribution. Number is the value of the error probability. Degrees_freedom is the degrees of freedom of the experiment.
CHITEST(data_B; data_E)	Returns the chi-square distribution from a random distribution of two test series based on the chi-square test for independence. The probability determined by CHITEST can also be determined with CHIDIST, in which case the chi square of the random sample must then be passed as a parameter instead of the data row. Data_B is the array of the observations. Data_E is the range of the expected values.
CONFIDENCE(alpha; STDEV; size)	Returns the (1-alpha) confidence interval for a normal distribution. Alpha is the level of the confidence interval. STDEV is the standard deviation for the total population. Size is the size of the total population.
CORREL(data_1; data_2)	Returns the correlation coefficient between two data sets. Data_1 is the first data set. Data_2 is the second data set.
COUNT(value_1; value_2; ... value_30)	Counts how many numbers are in the list of arguments. Text entries are ignored. Value_1; value_2; ... value_30 are values or ranges which are to be counted.
COUNTA(value_1; value_2; ... value_30)	Counts how many values are in the list of arguments. Text entries are also counted, even when they contain an empty string of length 0. If an argument is an array or reference, empty cells within the array or reference are ignored. value_1; value_2; ... value_30 are up to 30 arguments representing the values to be counted.
COVAR(data_1; data_2)	Returns the covariance of the product of paired deviations. Data_1 is the first data set. Data_2 is the second data set.
CRITBINOM(trials; SP; alpha)	Returns the smallest value for which the cumulative binomial distribution is less than or equal to a criterion value. Trials is the total number of trials. SP is the probability of success for one trial. Alpha is the threshold probability to be reached or exceeded.
DEVSQ(number_1; number_2; ... number_30)	Returns the sum of squares of deviations based on a sample mean. Number_1; number_2; ... number_30 are numerical values or ranges representing a sample.
EXPONDIST(number; lambda; C)	Returns the exponential distribution. Number is the value of the function. Lambda is the parameter value. C is a logical value that determines the form of the function. C = 0 calculates the density function, and C = 1 calculates the distribution.
FDIST(number; degrees_freedom_1; degrees_freedom_2)	Calculates the values of an F probability distribution. Number is the value for which the F distribution is to be calculated. Degrees_freedom_1 is the degrees of freedom in the numerator in the F distribution. Degrees_freedom_2 is the degrees of freedom in the denominator in the F distribution.
FINV(number; degrees_freedom_1; degrees_freedom_2)	Returns the inverse of the F probability distribution. Number is probability value for which the inverse F distribution is to be calculated. Degrees_freedom_1 is the number of degrees of freedom in the numerator of the F distribution. Degrees_freedom_2 is the number of degrees of freedom in the denominator of the F distribution.
FISHER(number)	Returns the Fisher transformation for the given number and creates a function close to a normal distribution.
FISHERINV(number)	Returns the inverse of the Fisher transformation for the given number and creates a function close to a normal distribution.
FORECAST(value; data_Y; data_X)	Extrapolates future values based on existing x and y values. Value is the x value, for which the y value of the linear regression is to be returned. Data_Y is the array or range of known yâ€™s. Data_X is the array or range of known xâ€™s. Does not work for exponential functions.
FTEST(data_1; data_2)	Returns the result of an F test. Data_1 is the first record array. Data_2 is the second record array.
GAMMADIST(number; alpha; beta; C)	Returns the values of a Gamma cumulative distribution. Number is the value for which the Gamma distribution is to be calculated. Alpha is the parameter Alpha of the Gamma distribution. Beta is the parameter Beta of the Gamma distribution. C = 0 calculates the density function, and C = 1 calculates the distribution.
GAMMAINV(number; alpha; beta)	Returns the inverse of the Gamma cumulative distribution. This function allows you to search for variables with different distribution. Number is the probability value for which the inverse Gamma distribution is to be calculated. Alpha is the parameter Alpha of the Gamma distribution. Beta is the parameter Beta of the Gamma distribution.
GAMMALN(number)	Returns the natural logarithm of the Gamma function, G(x), for the given number.
GAUSS(number)	Returns the standard normal cumulative distribution for the given number.
GEOMEAN(number_1; number_2; ... number_30)	Returns the geometric mean of a sample. Number_1; number_2; ... number_30 are numerical arguments or ranges that represent a random sample.
HARMEAN(number_1; number_2; ... number_30)	Returns the harmonic mean of a data set. Number_1; number_2; ... number_30 are values or ranges that can be used to calculate the harmonic mean.
HYPGEOMDIST(X; n_sample; successes; n_population)	Returns the hypergeometric distribution. X is the number of results achieved in the random sample. N_sample is the size of the random sample. Successes is the number of possible results in the total population. N_population is the size of the total population.
INTERCEPT(data_Y; data_X)	Calculates the y-value at which a line will intersect the y-axis by using known x-values and y-values. Data_Y is the dependent set of observations or data. Data_X is the independent set of observations or data. Names, arrays or references containing numbers must be used here. Numbers can also be entered directly.
KURT(number_1; number_2; ... number_30)	Returns the kurtosis of a data set (at least 4 values required). Number_1; number_2; ... number_30 are numerical arguments or ranges representing a random sample of distribution.
LARGE(data; rank_c)	Returns the Rank_c-th largest value in a data set. Data is the cell range of data. Rank_c is the ranking of the value (2nd largest, 3rd largest, etc.) written as an integer.
LOGINV(number; mean; STDEV)	Returns the inverse of the lognormal distribution for the given Number, a probability value. Mean is the arithmetic mean of the standard logarithmic distribution. STDEV is the standard deviation of the standard logarithmic distribution.
LOGNORMDIST(number; mean; STDEV)	Returns the cumulative lognormal distribution for the given Number, a probability value. Mean is the mean value of the standard logarithmic distribution. STDEV is the standard deviation of the standard logarithmic distribution.
MAX(number_1; number_2; ... number_30)	Returns the maximum value in a list of arguments. Number_1; number_2; ... number_30 are numerical values or ranges.
MAXA(value_1; value_2; ... value_30)	Returns the maximum value in a list of arguments. Unlike MAX, text can be entered. The value of the text is 0. Value_1; value_2; ... value_30 are values or ranges.
MEDIAN(number_1; number_2; ... number_30)	Returns the median of a set of numbers. Number_1; number_2; ... number_30 are values or ranges, which represent a sample. Each number can also be replaced by a reference.
MIN(number_1; number_2; ... number_30)	Returns the minimum value in a list of arguments. Number_1; number_2; ... number_30 are numerical values or ranges.
MINA(value_1; value_2; ... value_30)	Returns the minimum value in a list of arguments. Here text can also be entered. The value of the text is 0. Value_1; value_2; ... value_30 are values or ranges.
MODE(number_1; number_2; ... number_30)	Returns the most common value in a data set. Number_1; number_2; ... number_30 are numerical values or ranges. If several values have the same frequency, it returns the smallest value. An error occurs when a value does not appear twice.
NEGBINOMDIST(X; R; SP)	Returns the negative binomial distribution. X is the value returned for unsuccessful tests. R is the value returned for successful tests. SP is the probability of the success of an attempt.
NORMDIST(number; mean; STDEV; C)	Returns the normal distribution for the given Number in the distribution. Mean is the mean value of the distribution. STDEV is the standard deviation of the distribution. C = 0 calculates the density function, and C = 1 calculates the distribution.
NORMINV(number; mean; STDEV)	Returns the inverse of the normal distribution for the given Number in the distribution. Mean is the mean value in the normal distribution. STDEV is the standard deviation of the normal distribution.
NORMSDIST(number)	Returns the standard normal cumulative distribution for the given Number.
NORMSINV(number)	Returns the inverse of the standard normal distribution for the given Number, a probability value.
PEARSON(data_1; data_2)	Returns the Pearson product moment correlation coefficient r. Data_1 is the array of the first data set. Data_2 is the array of the second data set.
PERCENTILE(data; alpha)	Returns the alpha-percentile of data values in an array. Data is the array of data. Alpha is the percentage of the scale between 0 and 1.
PERCENTRANK(data; value)	Returns the percentage rank (percentile) of the given value in a sample. Data is the array of data in the sample.
PERMUT(count_1; count_2)	Returns the number of permutations for a given number of objects. Count_1 is the total number of objects. Count_2 is the number of objects in each permutation.
PERMUTATIONA(count_1; count_2)	Returns the number of permutations for a given number of objects (repetition allowed). Count_1 is the total number of objects. Count_2 is the number of objects in each permutation.
PHI(number)	Returns the values of the distribution function for a standard normal distribution for the given Number.
POISSON(number; mean; C)	Returns the Poisson distribution for the given Number. Mean is the middle value of the Poisson distribution. C = 0 calculates the density function, and C = 1 calculates the distribution.
PROB(data; probability: start; end)	Returns the probability that values in a range are between two limits. Data is the array or range of data in the sample. Probability is the array or range of the corresponding probabilities. Start is the start value of the interval whose probabilities are to be summed. End (optional) is the end value of the interval whose probabilities are to be summed. If this parameter is missing, the probability for the Start value is calculated.
QUARTILE(data; type)	Returns the quartile of a data set. Data is the array of data in the sample. Type is the type of quartile. (0 = Min, 1 = 25%, 2 = 50% (Median), 3 = 75% and 4 = Max.)
RANK(value; data; type)	Returns the rank of the given Value in a sample. Data is the array or range of data in the sample. Type (optional) is the sequence order, either ascending (0) or descending (1).
RSQ(data_Y; data_X)	Returns the square of the Pearson correlation coefficient based on the given values. Data_Y is an array or range of data points. Data_X is an array or range of data points.
SKEW(number_1; number_2; ... number_30)	Returns the skewness of a distribution. Number_1; number_2; ... number_30 are numerical values or ranges.
SLOPE(data_Y; data_X)	Returns the slope of the linear regression line. Data_Y is the array or matrix of Y data. Data_X is the array or matrix of X data.
SMALL(data; rank_c)	Returns the Rank_c-th smallest value in a data set. Data is the cell range of data. Rank_c is the rank of the value (2nd smallest, 3rd smallest, etc.) written as an integer.
STANDARDIZE(number; mean; STDEV)	Converts a random variable to a normalized value. Number is the value to be standardized. Mean is the arithmetic mean of the distribution. STDEV is the standard deviation of the distribution.
STDEV(number_1; number_2; ... number_30)	Estimates the standard deviation based on a sample. Number_1; number_2; ... number_30 are numerical values or ranges representing a sample based on an entire population.
STDEVA(value_1; value_2; ... value_30)	Calculates the standard deviation of an estimation based on a sample. Value_1; value_2; ... value_30 are values or ranges representing a sample derived from an entire population. Text has the value 0.
STDEVP(number_1; number_2; ... number_30)	Calculates the standard deviation based on the entire population. Number_1; number_2; ... number_30 are numerical values or ranges representing a sample based on an entire population.
STDEVPA(value_1; value_2; ... value_30)	Calculates the standard deviation based on the entire population. Value_1; value_2; ... value_30 are values or ranges representing a sample derived from an entire population. Text has the value 0.
STEYX(data_Y; data_X)	Returns the standard error of the predicted y value for each x in the regression. Data_Y is the array or matrix of Y data. Data_X is the array or matrix of X data.
TDIST(number; degrees_freedom; mode)	Returns the t-distribution for the given Number. Degrees_freedom is the number of degrees of freedom for the t-distribution. Mode = 1 returns the one-tailed test, Mode = 2 returns the two-tailed test.
TINV(number; degrees_freedom)	Returns the inverse of the t-distribution, for the given Number associated with the two-tailed t-distribution. Degrees_freedom is the number of degrees of freedom for the t-distribution.
TRIMMEAN(data; alpha)	Returns the mean of a data set without the Alpha proportion of data at the margins. Data is the array of data in the sample. Alpha is the proportion of the marginal data that will not be taken into consideration.
TTEST(data_1; data_2; mode; type)	Returns the probability associated with a Studentâ€™s t-Test. Data_1 is the dependent array or range of data for the first record. Data_2 is the dependent array or range of data for the second record. Mode = 1 calculates the one-tailed test, Mode = 2 the two- tailed test. Type of t-test to perform: paired (1), equal variance (homoscedastic) (2), or unequal variance (heteroscedastic) (3).
VAR(number_1; number_2; ... number_30)	Estimates the variance based on a sample. Number_1; number_2; ... number_30 are numerical values or ranges representing a sample based on an entire population.
VARA(value_1; value_2; ... value_30)	Estimates a variance based on a sample. The value of text is 0. Value_1; value_2; ... value_30 are values or ranges representing a sample derived from an entire population. Text has the value 0.
VARP(Number_1; number_2; ... number_30)	Calculates a variance based on the entire population. Number_1; number_2; ... number_30 are numerical values or ranges representing an entire population.
VARPA(value_1; value_2; .. .value_30)	Calculates the variance based on the entire population. The value of text is 0. Value_1; value_2; ... value_30 are values or ranges representing an entire population.
WEIBULL(number; alpha; beta; C)	Returns the values of the Weibull distribution for the given Number. Alpha is the Alpha parameter of the Weibull distribution. Beta is the Beta parameter of the Weibull distribution. C indicates the type of function: C= 0 the form of the function is calculated, C=1 the distribution is calculated.
ZTEST(data; number; sigma)	Returns the two-tailed P value of a z test with standard distribution. Data is the array of the data. Number is the value to be tested. Sigma (optional) is the standard deviation of the total population. If this argument is missing, the standard deviation of the sample is processed.

OpenOffice Calc User Guide
Previous Page	Home	Next Page