sn.em {sn}    R Documentation

Fitting Skew-normal variables using the EM algorithm

Description

Fits a skew-normal (SN) distribution to data, or fits a linear regression model with skew-normal errors, using the EM algorithm to locate the maximum likelihood estimate (MLE). The maximization can be global, or it can keep some components of the parameter vector fixed.

Usage

sn.em(X, y, fixed, p.eps=0.0001, l.eps=0.01, trace=FALSE, data=FALSE)

Arguments

y a vector containing the observed variable. This is the response variable in the case of linear regression.
X a matrix of explanatory variables. If X is missing, then a one-column matrix of all 1's is created. If X is supplied, and an intercept term is required, then it must include a column of 1's.
fixed a vector of length 3, indicating which components of the parameter vector must be regarded as fixed. With fixed=c(NA,NA,NA), which is the default setting, a global maximization is performed. If the 3rd component is given a value, then maximization is performed keeping the shape parameter fixed at that value. If both the 2nd and 3rd components are given values, then the scale and the shape parameters are kept fixed at those values. No other patterns of fixed values are allowed.
p.eps numerical value which regulates the parameter convergence tolerance.
l.eps numerical value which regulates the log-likelihood convergence tolerance.
trace logical value which controls printing of the algorithm's progress towards convergence. If trace=TRUE, details are printed. Default value is trace=FALSE.
data logical value. If data=TRUE, the returned list includes the original data. Default value is data=FALSE.
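
The three admissible patterns of fixed described above can be spelled out in a small sketch. The helper classify_fixed_sketch is hypothetical and is not part of the sn package; it only illustrates which combinations of NA and numeric values the documentation allows, and it assumes nothing about how sn.em validates its input internally.

```r
# Hypothetical helper (NOT part of package sn): classify a `fixed` vector
# according to the three patterns allowed by the documentation.
classify_fixed_sketch <- function(fixed = c(NA, NA, NA)) {
  stopifnot(length(fixed) == 3)
  free <- is.na(fixed)          # TRUE where the component is to be estimated
  if (all(free)) {
    "global"                    # nothing fixed: global maximization
  } else if (free[1] && free[2] && !free[3]) {
    "shape fixed"               # only the 3rd component (shape) is fixed
  } else if (free[1] && !free[2] && !free[3]) {
    "scale and shape fixed"     # 2nd (scale) and 3rd (shape) are fixed
  } else {
    stop("unsupported pattern of fixed values")
  }
}

classify_fixed_sketch(c(NA, NA, NA))
classify_fixed_sketch(c(NA, NA, 3))
classify_fixed_sketch(c(NA, 2, 3))
```

Any other pattern, for example c(1, NA, NA), makes the sketch stop with an error, mirroring the statement that no other patterns are allowed.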

Details

The function works using the direct parametrization; on convergence, the output is then given in both parametrizations.
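
As an illustration of how the two parametrizations relate, the following sketch maps direct parameters (location, scale, shape) to centred parameters (mean, standard deviation, skewness index) using the standard moment formulas of the scalar SN distribution. The function dp_to_cp_sketch and its output names are assumptions made for this example; it covers only the case without regression and is not the package's own conversion routine.

```r
# Hypothetical helper (NOT part of package sn): direct -> centred parameters
# for a scalar SN variable, dp = c(location, scale, shape).
dp_to_cp_sketch <- function(dp) {
  dp    <- unname(dp)
  xi    <- dp[1]                        # location
  omega <- dp[2]                        # scale
  alpha <- dp[3]                        # shape
  delta <- alpha / sqrt(1 + alpha^2)
  mu_z  <- sqrt(2 / pi) * delta         # mean of the standardised SN variable
  sd_z  <- sqrt(1 - mu_z^2)             # s.d. of the standardised SN variable
  c(mean     = xi + omega * mu_z,
    sd       = omega * sd_z,
    skewness = (4 - pi) / 2 * (mu_z / sd_z)^3)
}

dp_to_cp_sketch(c(0, 1, 0))   # shape 0 is the normal case: mean 0, sd 1, skewness 0
dp_to_cp_sketch(c(1, 2, 3))
```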

This function is based on the EM algorithm; it is generally quite slow, but it appears to be very robust. See sn.mle for an alternative method, which also returns standard errors.

Value

a list with the following components:

dp a vector of the direct parameters, as explained in the references below.
cp a vector of the centred parameters, as explained in the references below.
logL the log-likelihood at convergence.
data optionally (if data=TRUE), a list containing X and y, as supplied on input, and a vector of residuals, which should have an approximate SN distribution with location=0 and scale=1, in the direct parametrization.
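
To make the statement about the residuals concrete, here is a minimal sketch. It assumes that the residuals are the observations centred at the fitted location and divided by the direct scale parameter; this is consistent with the description above but is an assumption about the computation, not taken from the package source. Both function names are hypothetical. The density of an SN variable with location 0, scale 1 and shape alpha is 2 * dnorm(z) * pnorm(alpha * z), as in Azzalini (1985).

```r
# Hypothetical helpers (NOT part of package sn).

# Density of the SN distribution with location 0, scale 1 and shape alpha.
dsn_sketch <- function(z, alpha) 2 * dnorm(z) * pnorm(alpha * z)

# Assumed form of the residuals: (y - X %*% beta) / omega, where beta and
# omega are the regression and scale components of the direct parameters.
sn_residuals_sketch <- function(X, y, beta, omega) {
  as.vector(y - X %*% beta) / omega
}

X <- cbind(1, c(0, 1, 2))                 # intercept plus one covariate
y <- c(2, 3, 7)
sn_residuals_sketch(X, y, beta = c(1, 2), omega = 2)

# The density integrates to 1 for any shape value.
integrate(dsn_sketch, -Inf, Inf, alpha = 3)$value
```

A histogram of such residuals could then be compared with dsn_sketch evaluated at the fitted shape parameter as an informal check of the fit.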

Background

Background information on the SN distribution is given by Azzalini (1985). See Azzalini and Capitanio (1999) for a more detailed discussion of the direct and centred parametrizations.

References

Azzalini, A. (1985). A class of distributions which includes the normal ones. Scand. J. Statist. 12, 171-178.

Azzalini, A. and Capitanio, A. (1999). Statistical applications of the multivariate skew-normal distribution. J. Roy. Statist. Soc. B 61, 579-602.

See Also

dsn, sn.mle, cp.to.dp

Examples

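data(ais, package="sn")
attach(ais)
# fit an SN distribution to bmi (no covariates, global maximization)
a <- sn.em(y=bmi)
# regression of bmi on lbm and lbm^2; the column of 1's gives the intercept
a <- sn.em(X=cbind(1,lbm,lbm^2), y=bmi)
# design matrix built with model.matrix, which adds the intercept column
M <- model.matrix(~lbm+I(ais$sex))
b <- sn.em(M, bmi)
# keep scale fixed at 2 and shape fixed at 3; tighter log-likelihood tolerance
fit <- sn.em(y=bmi, fixed=c(NA, 2, 3), l.eps=0.001)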

[Package sn version 0.4-4 Index]