Basic Statistical Functions

mean (x, dim, opt) Function File
If x is a vector, compute the mean of the elements of x
          mean (x) = SUM_i x(i) / N
          
If x is a matrix, compute the mean for each column and return them in a row vector.

With the optional argument opt, the kind of mean computed can be selected. The following options are recognized:

"a"
Compute the (ordinary) arithmetic mean. This is the default.
"g"
Compute the geometric mean.
"h"
Compute the harmonic mean.

If the optional argument dim is supplied, work along dimension dim.

Both dim and opt are optional. If both are supplied, either may appear first.
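As an illustration, the three kinds of mean can be written out with elementary operations. This is a minimal sketch of the definitions for a vector of positive values, not the code used by mean; the data and the variable names are made up for the example.

```octave
% Made-up sample of positive values.
x = [1, 2, 4];
N = numel (x);

% Arithmetic mean: sum of the elements divided by N.
arith = sum (x) / N;

% Geometric mean: N-th root of the product of the elements.
geom = prod (x) ^ (1 / N);

% Harmonic mean: N divided by the sum of the reciprocals.
harm = N / sum (1 ./ x);

printf ("arithmetic %g, geometric %g, harmonic %g\n", arith, geom, harm);
```

For positive data the three values are ordered: the harmonic mean is the smallest and the arithmetic mean the largest.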

median (x) Function File
If x is a vector, compute the median value of the elements of x. If the elements of x are sorted, the median is defined as
                      x(ceil(N/2))             N odd
          median(x) =
                      (x(N/2) + x((N/2)+1))/2  N even
          
If x is a matrix, compute the median value for each column and return them in a row vector.
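The two cases of the definition can be checked by hand on a small vector. The following sketch sorts the data first and then applies the formula above; the sample values are made up.

```octave
% Made-up sample with an even number of elements.
x = [7, 1, 5, 3];

% The formula is stated for sorted data.
s = sort (x);
N = numel (s);

if (mod (N, 2) == 1)
  % N odd: take the middle element.
  m = s(ceil (N / 2));
else
  % N even: average the two middle elements.
  m = (s(N / 2) + s(N / 2 + 1)) / 2;
endif

printf ("median by hand: %g, median (x): %g\n", m, median (x));
```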

std (x) Function File
std (x, opt) Function File
std (x, opt, dim) Function File
If x is a vector, compute the standard deviation of the elements of x.
          std (x) = sqrt (sumsq (x - mean (x)) / (N - 1))
          
If x is a matrix, compute the standard deviation for each column and return them in a row vector.

The argument opt determines the type of normalization to use. Valid values are

0:
normalizes with N-1, provides the square root of the best unbiased estimator of the variance [default]
1:
normalizes with N, this provides the square root of the second moment around the mean

The third argument dim determines the dimension along which the standard deviation is calculated.
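The effect of the two normalizations is easiest to see side by side. The sketch below evaluates both by hand and compares them with std; the data are made up for the example.

```octave
% Made-up sample with mean 5.
x = [2, 4, 4, 4, 5, 5, 7, 9];
N = numel (x);

% Deviations from the mean.
d = x - mean (x);

% opt = 0: normalize with N-1 (the default).
s0 = sqrt (sum (d .^ 2) / (N - 1));

% opt = 1: normalize with N.
s1 = sqrt (sum (d .^ 2) / N);

printf ("N-1: %g (std: %g)\n", s0, std (x));
printf ("N:   %g (std: %g)\n", s1, std (x, 1));
```

The two results differ by the factor sqrt (N / (N - 1)), which approaches 1 as N grows.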

cov (x, y) Function File
If each row of x and y is an observation and each column is a variable, the (i, j)-th entry of cov (x, y) is the covariance between the i-th variable in x and the j-th variable in y. If called with one argument, compute cov (x, x).

corrcoef (x, y) Function File
If each row of x and y is an observation and each column is a variable, the (i, j)-th entry of corrcoef (x, y) is the correlation between the i-th variable in x and the j-th variable in y. If called with one argument, compute corrcoef (x, x).
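The entries described for cov and corrcoef can be written out directly from centered data. The sketch below does not call either function; it assumes normalization by N-1 (the text does not state the normalization), and the data are made up.

```octave
% Rows are observations, columns are variables (made-up data).
x = [1, 2; 2, 4; 3, 9];
y = [1; 0; 2];
N = rows (x);

% Remove the column means (relies on automatic broadcasting).
xc = x - mean (x);
yc = y - mean (y);

% Assumption: normalize by N-1, matching the default of std.
% Entry (i, j) pairs variable i of x with variable j of y.
c = (xc' * yc) / (N - 1);

% Dividing by the standard deviations turns covariances into correlations.
r = c ./ (std (x)' * std (y));

disp (c);
disp (r);
```

Because the same normalization is used in the numerator and in std, it cancels in r, so the correlations do not depend on the N-1 assumption.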

kurtosis (x, dim) Function File
If x is a vector of length N, return the kurtosis
          kurtosis (x) = N^(-1) std(x)^(-4) sum ((x - mean(x)).^4) - 3
          

of x. If x is a matrix, return the kurtosis over the first non-singleton dimension. The optional argument dim can be given to force the kurtosis to be given over that dimension.

mahalanobis (x, y) Function File
Return the Mahalanobis' D-square distance between the multivariate samples x and y, which must have the same number of components (columns), but may have a different number of observations (rows).
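The text does not give a formula, so the following is only a sketch under a stated assumption: that D-square is the squared difference of the sample means, weighted by the inverse of the pooled covariance matrix of the two samples. The data and all variable names are made up.

```octave
% Two made-up samples with two components each.
x = [0, 0; 2, 0; 0, 2; 2, 2];
y = [3, 0; 5, 0; 3, 2; 5, 2];
nx = rows (x);
ny = rows (y);

% Sample means and centered data.
mx = mean (x);
my = mean (y);
xc = x - mx;
yc = y - my;

% Assumption: pooled covariance with nx + ny - 2 degrees of freedom.
w = (xc' * xc + yc' * yc) / (nx + ny - 2);

% Squared distance between the means in the metric defined by w.
d = mx - my;
d2 = d * (w \ d');

printf ("D-square: %g\n", d2);
```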

skewness (x, dim) Function File
If x is a vector of length N, return the skewness
          skewness (x) = N^(-1) std(x)^(-3) sum ((x - mean(x)).^3)
          

of x. If x is a matrix, return the skewness along the first non-singleton dimension of the matrix. If the optional dim argument is given, operate along this dimension.
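The skewness formula above, and the kurtosis formula given earlier, can be evaluated term by term. The sketch below follows the formulas exactly as written, with std using its default N-1 normalization; it deliberately does not call skewness or kurtosis, whose normalization may differ between Octave versions. The data are made up.

```octave
% Made-up sample with one large value on the right.
x = [1, 2, 3, 4, 10];
N = numel (x);

% Deviations from the mean and the default standard deviation.
d = x - mean (x);
s = std (x);

% skewness (x) = N^(-1) std(x)^(-3) sum ((x - mean(x)).^3)
skew = sum (d .^ 3) / (N * s ^ 3);

% kurtosis (x) = N^(-1) std(x)^(-4) sum ((x - mean(x)).^4) - 3
kurt = sum (d .^ 4) / (N * s ^ 4) - 3;

printf ("skewness %g, kurtosis %g\n", skew, kurt);
```

The positive skewness reflects the long right tail created by the value 10.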

values (x) Function File
Return the distinct values of x in a column vector, arranged in ascending order.
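The described result, a sorted column vector of distinct values, can be reproduced with unique. This is a sketch of equivalent behaviour using a different function as a stand-in, not a description of how values is implemented.

```octave
% Made-up data with repeated entries.
x = [3, 1, 2; 3, 1, 5];

% x(:) stacks the elements into a column; unique sorts and
% removes duplicates.
v = unique (x(:));

disp (v');
```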

var (x) Function File
For vector arguments, return the (real) variance of the values. For matrix arguments, return a row vector containing the variance for each column.

The argument opt determines the type of normalization to use. Valid values are

0:
normalizes with N-1, provides the best unbiased estimator of the variance [default]
1:
normalizes with N, this provides the second moment around the mean

The third argument dim determines the dimension along which the variance is calculated.

[t, l_x] = table (x) Function File
[t, l_x, l_y] = table (x, y) Function File
Create a contingency table t from data vectors. The l vectors are the corresponding levels.

Currently only 1- and 2-dimensional tables are supported.
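For the one-dimensional case, the counts and levels can be sketched with unique and accumarray. This illustrates the idea of a contingency table and is not the implementation of table; the data are made up and the names t and l_x simply mirror the signature above.

```octave
% Made-up data vector.
x = [1, 2, 2, 3, 2];

% l_x holds the distinct levels; idx maps each element of x
% to the position of its level in l_x.
[l_x, ~, idx] = unique (x);

% Count how many elements fall on each level.
t = accumarray (idx(:), 1);

disp (l_x);
disp (t');
```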

studentize (x, dim) Function File
If x is a vector, subtract its mean and divide by its standard deviation.

If x is a matrix, do the above along the first non-singleton dimension. If the optional argument dim is given, then operate along this dimension.
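For a vector, the operation amounts to a single line. The sketch below writes it out by hand, assuming the default N-1 normalization of std; the data are made up.

```octave
% Made-up sample with mean 4 and standard deviation 2.
x = [2, 4, 6];

% Subtract the mean, then divide by the standard deviation.
z = (x - mean (x)) / std (x);

disp (z);
```

After the transformation the data have mean 0 and standard deviation 1.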

statistics (x) Function File
If x is a matrix, return a matrix with the minimum, first quartile, median, third quartile, maximum, mean, standard deviation, skewness, and kurtosis of the columns of x as its rows.

If x is a vector, treat it as a column vector.

spearman (x, y) Function File
Compute Spearman's rank correlation coefficient rho for each of the variables specified by the input arguments.

For matrices, each row is an observation and each column a variable; vectors are always observations and may be row or column vectors.

spearman (x) is equivalent to spearman (x, x).

For two data vectors x and y, Spearman's rho is the correlation of the ranks of x and y.

If x and y are drawn from independent distributions, rho has zero mean and variance 1 / (n - 1), and is asymptotically normally distributed.
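The definition of rho as the correlation of the ranks can be followed step by step for two vectors without ties. This is a minimal sketch with made-up data; it computes the ranks from the sort order and does not handle ties.

```octave
% Made-up data without ties.
x = [10, 20, 30, 40, 50];
y = [1, 3, 2, 5, 4];
n = numel (x);

% Ranks from the sort order (valid only when there are no ties).
rx = zeros (1, n);
ry = zeros (1, n);
[~, ix] = sort (x);
[~, iy] = sort (y);
rx(ix) = 1:n;
ry(iy) = 1:n;

% Ordinary correlation coefficient of the two rank vectors.
dx = rx - mean (rx);
dy = ry - mean (ry);
rho = sum (dx .* dy) / sqrt (sum (dx .^ 2) * sum (dy .^ 2));

printf ("rho = %g\n", rho);
```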

run_count (x, n) Function File
Count the upward runs along the first non-singleton dimension of x of length 1, 2, ..., n-1 and greater than or equal to n. If the optional argument dim is given, operate along this dimension.

ranks (x, dim) Function File
If x is a vector, return the (column) vector of ranks of x adjusted for ties.

If x is a matrix, do the above along the first non-singleton dimension. If the optional argument dim is given, operate along this dimension.
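A small example shows what a tie adjustment looks like. The sketch assumes that tied values each receive the average of the ranks they would otherwise occupy (midranks); it uses a simple loop for clarity and made-up data.

```octave
% Made-up data with one tie.
x = [10, 20, 20, 30];
n = numel (x);
r = zeros (n, 1);

for i = 1:n
  % Number of strictly smaller values, plus the average position
  % within the group of values equal to x(i).
  r(i) = sum (x < x(i)) + (sum (x == x(i)) + 1) / 2;
endfor

disp (r');
```

The two tied values share the rank 2.5, so the ranks still sum to n (n + 1) / 2.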

range (x) Function File
range (x, dim) Function File
If x is a vector, return the range, i.e., the difference between the maximum and the minimum of the input data.

If x is a matrix, do the above for each column of x.

If the optional argument dim is supplied, work along dimension dim.
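The column-wise default and the effect of dim can be written out with max and min. This is a sketch of the definition on a made-up matrix.

```octave
% Made-up 2-by-2 matrix.
x = [1, 5; 4, 2];

% Default: range of each column, returned as a row vector.
r_cols = max (x) - min (x);

% Along dimension 2: range of each row, returned as a column vector.
r_rows = max (x, [], 2) - min (x, [], 2);

disp (r_cols);
disp (r_rows');
```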

[q, s] = qqplot (x, dist, params) Function File
Perform a QQ-plot (quantile plot).

If F is the CDF of the distribution dist with parameters params and G its inverse, and x a sample vector of length n, the QQ-plot graphs ordinate s(i) = i-th largest element of x versus abscissa q(i) = G((i - 0.5)/n).

If the sample comes from F, except for a transformation of location and scale, the pairs will approximately follow a straight line.

The default for dist is the standard normal distribution. The optional argument params contains a list of parameters of dist. For example, for a quantile plot of the uniform distribution on [2,4] and x, use

          qqplot (x, "uniform", 2, 4)
          

If no output arguments are given, the data are plotted directly.
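The pairs that make up the plot can be computed by hand when the inverse CDF is known in closed form. The sketch below uses the uniform distribution on [2,4], whose inverse CDF is G(p) = 2 + 2 p, and does not call qqplot. The sample is made up, and sorting it in ascending order is an assumption of the sketch, chosen so that s(i) and q(i) increase together.

```octave
% Made-up sample from the interval [2,4].
x = [3.4, 2.2, 3.8, 2.6, 3.0];
n = numel (x);
a = 2;
b = 4;

% Ordinates: the sorted sample (ascending order is an assumption).
s = sort (x(:));

% Abscissas: G((i - 0.5)/n) with G(p) = a + (b - a) * p.
i = (1:n)';
q = a + (b - a) * (i - 0.5) / n;

disp ([q, s]);
```

For this hand-picked sample the pairs fall exactly on the line s = q; a real sample would only follow it approximately.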

probit (p) Function File
For each component of p, return the probit (the quantile of the standard normal distribution) of p.
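The standard normal quantile can be expressed through the inverse error function, which gives a way to evaluate the probit without calling probit. This is a sketch based on the identity quantile (p) = sqrt (2) * erfinv (2 p - 1); the probabilities are made up.

```octave
% Made-up probabilities.
p = [0.5, 0.975];

% Standard normal quantile via the inverse error function.
z = sqrt (2) * erfinv (2 * p - 1);

printf ("%g\n", z);
```

The second value is the familiar two-sided 95% critical value of about 1.96.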

[p, y] = ppplot (x, dist, params) Function File
Perform a PP-plot (probability plot).

If F is the CDF of the distribution dist with parameters params and x a sample vector of length n, the PP-plot graphs ordinate y(i) = F (i-th largest element of x) versus abscissa p(i) = (i - 0.5)/n. If the sample comes from F, the pairs will approximately follow a straight line.

The default for dist is the standard normal distribution. The optional argument params contains a list of parameters of dist. For example, for a probability plot of the uniform distribution on [2,4] and x, use

          ppplot (x, "uniform", 2, 4)
          

If no output arguments are given, the data are plotted directly.

moment (x, p, opt, dim) Function File
If x is a vector, compute the p-th moment of x.

If x is a matrix, return the row vector containing the p-th moment of each column.

With the optional string opt, the kind of moment to be computed can be specified. If opt contains "c" or "a", central and/or absolute moments are returned. For example,

          moment (x, 3, "ac")
          

computes the third central absolute moment of x.

If the optional argument dim is supplied, work along dimension dim.
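The difference between raw, central, and central absolute moments is easiest to see on a small vector. The sketch below assumes that moments are normalized by the number of elements N (the text does not state the normalization) and does not call moment; the data are made up.

```octave
% Made-up sample with mean 3.
x = [1, 2, 3, 6];
p = 3;
N = numel (x);

% Raw moment: mean of x.^p.
raw = sum (x .^ p) / N;

% Central moment ("c"): subtract the mean first.
d = x - mean (x);
central = sum (d .^ p) / N;

% Central absolute moment ("ac"): also take absolute values.
central_abs = sum (abs (d) .^ p) / N;

printf ("raw %g, central %g, central absolute %g\n", raw, central, central_abs);
```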

meansq (x) Function File
meansq (x, dim) Function File
For vector arguments, return the mean square of the values. For matrix arguments, return a row vector containing the mean square of each column. With the optional dim argument, return the mean square of the values along this dimension.

logit (p) Function File
For each component of p, return the logit log (p / (1-p)) of p.
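The formula can be applied element-wise with the ./ operator. The sketch below evaluates it directly, without calling logit, and also inverts it to recover the probabilities; the input values are made up.

```octave
% Made-up probabilities, symmetric around 0.5.
p = [0.25, 0.5, 0.75];

% Log-odds, computed element by element.
l = log (p ./ (1 - p));

% The inverse transformation recovers p.
p_back = 1 ./ (1 + exp (-l));

disp (l);
```

Probabilities that are symmetric around 0.5 map to values that are symmetric around 0.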

kendall (x, y) Function File
Compute Kendall's tau for each of the variables specified by the input arguments.

For matrices, each row is an observation and each column a variable; vectors are always observations and may be row or column vectors.

kendall (x) is equivalent to kendall (x, x).

For two data vectors x, y of common length n, Kendall's tau is the correlation of the signs of all rank differences of x and y; i.e., if both x and y have distinct entries, then

                   1
          tau = -------   SUM sign (q(i) - q(j)) * sign (r(i) - r(j))
                n (n-1)   ij
          

in which the q(i) and r(i) are the ranks of x and y, respectively.

If x and y are drawn from independent distributions, Kendall's tau is asymptotically normal with mean 0 and variance (2 * (2n+5)) / (9 * n * (n-1)).
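The double sum in the formula for tau can be evaluated with matrices of pairwise differences. The sketch below assumes distinct entries, in which case the sign of a rank difference equals the sign of the corresponding value difference, so the ranks need not be computed. The data are made up.

```octave
% Made-up data with distinct entries.
x = [1, 2, 3, 4];
y = [1, 3, 2, 4];
n = numel (x);

% Entry (i, j) is the sign of x(i) - x(j), via broadcasting.
sx = sign (x' - x);
sy = sign (y' - y);

% Sum over all ordered pairs and normalize by n (n-1).
tau = sum (sum (sx .* sy)) / (n * (n - 1));

printf ("tau = %g\n", tau);
```

Of the six unordered pairs, five are ordered the same way in x and y and one is not, which gives tau = (5 - 1) / 6.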

iqr (x, dim) Function File
If x is a vector, return the interquartile range, i.e., the difference between the upper and lower quartile of the input data.

If x is a matrix, do the above for the first non-singleton dimension of x. If the optional argument dim is given, then operate along this dimension.

cut (x, breaks) Function File
Create categorical data out of numerical or continuous data by cutting into intervals.

If breaks is a scalar, the data is cut into that many equal-width intervals. If breaks is a vector of break points, the data is cut into length (breaks) - 1 groups.

The returned value is a vector of the same size as x telling which group each point in x belongs to. Groups are labelled from 1 to the number of groups; points outside the range of breaks are labelled by NaN.
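The labelling rule for a vector of break points can be sketched with a short loop. This is an illustration rather than the implementation of cut: it uses made-up data chosen so that no point lies exactly on a break, because the text does not say which group a boundary point belongs to.

```octave
% Made-up data and break points; no value lies on a break.
x = [0.5, 1.5, 2.5, 3.5];
breaks = [1, 2, 3];

% Start with NaN so that points outside the breaks stay unlabelled.
group = NaN (size (x));

for k = 1:numel (breaks) - 1
  % Points strictly between two consecutive breaks form group k.
  in_k = (x > breaks(k)) & (x < breaks(k + 1));
  group(in_k) = k;
endfor

disp (group);
```

With three break points there are two groups; the first and last points fall outside the range and keep the label NaN.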

cor (x, y) Function File
The (i, j)-th entry of cor (x, y) is the correlation between the i-th variable in x and the j-th variable in y.

For matrices, each row is an observation and each column a variable; vectors are always observations and may be row or column vectors.

cor (x) is equivalent to cor (x, x).

cloglog (x) Function File
Return the complementary log-log function of x, defined as
          - log (- log (x))
          

center (x) Function File
center (x, dim) Function File
If x is a vector, subtract its mean. If x is a matrix, do the above for each column. If the optional argument dim is given, perform the above operation along this dimension.
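The column-wise default and the effect of dim can be written out with mean and automatic broadcasting. This is a sketch of the operation on a made-up matrix and does not call center.

```octave
% Made-up 2-by-2 matrix.
x = [1, 2; 3, 6];

% Default: subtract the mean of each column.
c_cols = x - mean (x, 1);

% Along dimension 2: subtract the mean of each row.
c_rows = x - mean (x, 2);

disp (c_cols);
disp (c_rows);
```

In the first result every column sums to zero; in the second, every row does.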