cumulative {VGAM}R Documentation

Ordinal Regression with Cumulative Probabilities

Description

Fits a cumulative logit/probit/cloglog/cauchit/... regression model to an ordered (preferably) factor response.

Usage

cumulative(link = "logit", earg = list(),
           parallel = FALSE, reverse = FALSE,
           mv = FALSE, intercept.apply = FALSE)

Arguments

In the following, the response Y is assumed to be a factor with ordered values 1,2,...,M+1, so that M is the number of linear/additive predictors eta_j.

link Link function applied to the M cumulative probabilities. See Links for more choices.
earg List. Extra argument for the link function. See earg in Links for general information.
parallel A logical, or formula specifying which terms have equal/unequal coefficients.
reverse Logical. By default, the cumulative probabilities used are P(Y<=1), P(Y<=2), ..., P(Y<=M). If reverse is TRUE, then P(Y>=2), P(Y>=3), ..., P(Y>=M+1) will be used.
This should be set to TRUE for link= golf, polf, nbolf. For these links the cutpoints must be an increasing sequence; if reverse=FALSE for then the cutpoints must be an decreasing sequence.
mv Logical. Multivariate response? If TRUE then the input should be a matrix with values 1,2,...,L, where L is the number of levels. Each column of the matrix is a response, i.e., multivariate response. A suitable matrix can be obtained from Cut.
intercept.apply Logical. Whether the parallel argument should be applied to the intercept term. This should be set to TRUE for link= golf, polf, nbolf.

Details

By default, the non-parallel cumulative logit model is fitted, i.e.,

eta_j = logit(P[Y<=j])

where j=1,2,...,M and the eta_j are not constrained to be parallel. This is also known as the non-proportional odds model. If the logit link is replaced by a complementary log-log link (cloglog) then this is known as the proportional-hazards model.

In almost all the literature, the constraint matrices associated with this family of models are known. For example, setting parallel=TRUE will make all constraint matrices (except for the intercept) equal to a vector of M 1's. If the constraint matrices are equal, unknown and to be estimated, then this can be achieved by fitting the model as a reduced-rank vector generalized linear model (RR-VGLM; see rrvglm). Currently, reduced-rank vector generalized additive models (RR-VGAMs) have not been implemented here.

Value

An object of class "vglmff" (see vglmff-class). The object is used by modelling functions such as vglm, rrvglm and vgam.

Warning

No check is made to verify that the response is ordinal; see ordered.

Note

The response should be either a matrix of counts (with row sums that are all positive), or a factor. In both cases, the y slot returned by vglm/vgam/rrvglm is the matrix of counts.

For a nominal (unordered) factor response, the multinomial logit model (multinomial) is more appropriate.

With the logit link, setting parallel=TRUE will fit a proportional odds model. Note that the TRUE here does not apply to the intercept term. In practice, the validity of the proportional odds assumption needs to be checked, e.g., by a likelihood ratio test. If acceptable on the data, then numerical problems are less likely to occur during the fitting, and there are less parameters. Numerical problems occur when the linear/additive predictors cross, which results in probabilities outside of (0,1); setting parallel=TRUE will help avoid this problem.

Here is an example of the usage of the parallel argument. If there are covariates x1, x2 and x3, then parallel = TRUE ~ x1 + x2 -1 and parallel = FALSE ~ x3 are equivalent. This would constrain the regression coefficients for x1 and x2 to be equal; those of the intercepts and x3 would be different.

In the future, this family function may be renamed to ``cups'' (for cumulative probabilities) or ``cute'' (for cumulative probabilities).

Author(s)

Thomas W. Yee

References

Agresti, A. (2002) Categorical Data Analysis, 2nd ed. New York: Wiley.

Dobson, A. J. (2001) An Introduction to Generalized Linear Models, 2nd ed. Boca Raton: Chapman & Hall/CRC Press.

McCullagh, P. and Nelder, J. A. (1989) Generalized Linear Models, 2nd ed. London: Chapman & Hall.

Simonoff, J. S. (2003) Analyzing Categorical Data, New York: Springer-Verlag.

Yee, T. W. and Wild, C. J. (1996) Vector generalized additive models. Journal of the Royal Statistical Society, Series B, Methodological, 58, 481–493.

Documentation accompanying the VGAM package at http://www.stat.auckland.ac.nz/~yee contains further information and examples.

See Also

acat, cratio, sratio, multinomial, pneumo, logit, probit, cloglog, cauchit, golf, polf, nbolf.

Examples

# Fit the proportional odds model, p.179, in McCullagh and Nelder (1989)
data(pneumo)
pneumo = transform(pneumo, let=log(exposure.time))
(fit = vglm(cbind(normal, mild, severe) ~ let,
            cumulative(parallel=TRUE, reverse=TRUE), pneumo))
fit@y   # Sample proportions
weights(fit, type="prior")   # Number of observations
coef(fit, matrix=TRUE)
constraints(fit)   # Constraint matrices

# Check that the model is linear in let
fit2 = vgam(cbind(normal, mild, severe) ~ s(let, df=2),
            cumulative(reverse=TRUE), pneumo)
## Not run: 
plot(fit2, se=TRUE, overlay=TRUE, lcol=1:2, scol=1:2)
## End(Not run)

# Check the proportional odds assumption with a likelihood ratio test
(fit3 = vglm(cbind(normal, mild, severe) ~ let,
             cumulative(parallel=FALSE, reverse=TRUE), pneumo))
1 - pchisq(2*(logLik(fit3)-logLik(fit)),
           df=length(coef(fit3))-length(coef(fit)))

[Package VGAM version 0.7-4 Index]