Smyth, G. K., and Verbyla, A. P. (1999). Adjusted likelihood methods for modelling dispersion in generalized linear models. Environmetrics 10, 695-709.

Adjusted Likelihood Methods for Modelling Dispersion in Generalized Linear Models

Gordon K. Smyth, Department of Mathematics, University of Queensland
and Arünas P. Verbyla, Department of Statistics, University of Adelaide

Abstract

This paper considers double generalized linear models, which allow the mean and dispersion to be modelled simultaneously in a generalized linear model context. Estimation of the dispersion parameters is based on a χ²₁ (chi-squared on one degree of freedom) approximation to the unit deviances, and the accuracy of the saddle-point approximation which underlies this is discussed. Approximate REML methods are developed for estimation of the dispersion, and these are related to the likelihood adjustment methods of McCullagh and Tibshirani (1990) and Cox and Reid (1987). The approximate REML methods can be implemented with very little added complication in a generalized linear model setting by adjusting the working vector and working weights. S-Plus functions for double generalized linear models are described. Through two data examples it is shown that the approximate REML methods are more robust than maximum likelihood, in the sense of being less sensitive to perturbations in the mean model.

Keywords: dispersion modelling; REML; generalized linear models; slippage models; adjusted profile likelihood.

Introduction

Generalized linear models allow us to model responses which are not normally distributed, using methods closely analogous to normal linear methods for normal data (McCullagh and Nelder, 1989). They are more general than normal linear methods in two respects: a mean-variance relationship appropriate for the data can be accommodated, and an appropriate scale can be chosen for modelling the mean, on which the action of the covariates is approximately linear. On the other hand, once the mean-variance relationship is specified, the variance is assumed known up to a constant of proportionality, the dispersion parameter. While generalized linear models continue to be extremely useful, the complexities often encountered in observed data and the possibilities opened by modern computing power ensure that there is a strong need now for even more flexible models. Modern requirements are for models which include random effects, non-parametric trends and non-homogeneous dispersion. A comprehensive attack on many real problems in biomedical or environmental research would involve an integration of these and other components. In this paper we concentrate on non-homogeneous dispersion and the modelling of dispersion in terms of covariates.

It is well known that efficient estimation of mean parameters in regression depends on correct modelling of the dispersion. The loss of efficiency in using constant dispersion models when the dispersion is varying may be substantial. Modelling of the dispersion is also necessary to obtain correct standard errors and confidence intervals, as well as for many other applications such as prediction, estimation of detection limits or immunoassay (Carroll, 1987; Carroll and Ruppert, 1988). In many studies, modelling the dispersion will be of direct interest in its own right, to identify the sources of variability in the observations.
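The efficiency loss mentioned above can be illustrated in the simplest setting, a normal straight-line regression with known, non-constant variances. The following is a minimal sketch, not taken from the paper: the design points, the assumed variance function (standard deviation proportional to x) and the function name `ols_wls_slope_variances` are all hypothetical choices made for illustration. It compares the exact sampling variance of the slope under ordinary least squares, which ignores the varying dispersion, with that under correctly weighted least squares.

```python
import numpy as np

def ols_wls_slope_variances(x, variances):
    """Exact sampling variances of the OLS and WLS slope estimators for
    y = b0 + b1*x + e, where Var(e_i) = variances[i] is treated as known."""
    X = np.column_stack([np.ones_like(x), x])
    V = np.diag(variances)
    XtX_inv = np.linalg.inv(X.T @ X)
    # OLS ignores V, so its covariance is the "sandwich" form.
    cov_ols = XtX_inv @ X.T @ V @ X @ XtX_inv
    # WLS weights each observation by the reciprocal of its variance.
    cov_wls = np.linalg.inv(X.T @ np.diag(1.0 / variances) @ X)
    return cov_ols[1, 1], cov_wls[1, 1]

# Hypothetical design: 20 equally spaced points, standard deviation proportional to x.
x = np.linspace(1.0, 10.0, 20)
variances = x ** 2
var_ols, var_wls = ols_wls_slope_variances(x, variances)

# Relative efficiency of OLS for the slope (a value below 1 means information is lost).
efficiency = var_wls / var_ols
print(f"Var(slope) OLS = {var_ols:.4f}, WLS = {var_wls:.4f}, efficiency = {efficiency:.3f}")
```

In practice the variances are unknown and must themselves be modelled, which is the problem the paper addresses; the sketch only shows what is at stake when they are ignored.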

Many authors have considered dispersion modelling for normal data, for example Aitkin (1987), Carroll (1987), Davidian and Carroll (1987), and Carroll and Ruppert (1988). Smyth (1989) showed that similar methods could be used for a certain class of non-normal generalized linear models. In this paper we extend Smyth's (1989) methods to arbitrary generalized linear models by using the saddle-point approximation to the distribution of the responses.

Before dispersion modelling can take place, it is necessary to estimate the mean of the data accurately. For this reason, dispersion modelling takes place in the presence of a (possibly large) number of nuisance parameters. It is well known that maximum likelihood estimators for variance parameters in regression models are generally biased. For normal linear models it is common to use residual or restricted maximum likelihood (REML) instead of maximum likelihood to estimate parameters affecting the variances. REML maximizes the likelihood, not of the original observations, but of a set of zero mean contrasts. This has the effect of adjusting for available degrees of freedom, and produces estimators which are at least approximately unbiased.
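For the normal linear model with constant variance, the degrees-of-freedom adjustment described above has a familiar closed form: maximum likelihood divides the residual sum of squares by n, while REML divides by n - p, where p is the number of mean parameters. The sketch below illustrates this on simulated data; the sample size, number of covariates, seed and the function name `variance_estimates` are hypothetical choices for illustration and do not come from the paper.

```python
import numpy as np

def variance_estimates(X, y):
    """Return (ML, REML) estimates of the error variance in a normal
    linear model y = X beta + e with constant variance."""
    n, p = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((y - X @ beta) ** 2))
    # ML divides by n; REML divides by the residual degrees of freedom n - p.
    return rss / n, rss / (n - p)

# Hypothetical small-sample setting with many nuisance (mean) parameters.
rng = np.random.default_rng(0)
n, p = 12, 4
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
y = X @ np.ones(p) + rng.normal(size=n)  # true error variance is 1

sigma2_ml, sigma2_reml = variance_estimates(X, y)
print(f"ML estimate: {sigma2_ml:.3f}, REML estimate: {sigma2_reml:.3f}")
```

The two estimates differ by the factor n / (n - p), so the downward bias of maximum likelihood grows with the number of mean parameters relative to the sample size.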

The generalization of REML to non-normal models is not obvious, as zero mean contrasts do not generally exist. Several general methods of adjusting likelihood methods for nuisance parameters have been proposed, including Cox and Reid (1987), McCullagh and Tibshirani (1990) and Smyth and Verbyla (1996), which reduce to REML for normal linear models. In this paper we use the approach of McCullagh and Tibshirani (1990) to adjust the score vector and information matrices for leverage effects. We find that this requires minimal modification to the standard computations in a generalized linear model context. We note that the adjustments agree with Cox and Reid (1987) to second order, but not with the saddle-point conditional likelihood given by Smyth and Verbyla (1996).
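To give a sense of what adjusting for leverage means, consider the normal linear special case with constant variance, where the squared residual d_i has expectation σ²(1 - h_i) and h_i is the i-th diagonal element of the hat matrix. The sketch below is an illustration of that special case only, under assumed simulated data; it is not the paper's algorithm for generalized linear models, and the function name `leverages` is hypothetical. It checks numerically that the leverages sum to p, so that dividing the total of the squared residuals by the total of (1 - h_i) reproduces the REML estimator.

```python
import numpy as np

def leverages(X):
    """Diagonal of the hat matrix X (X'X)^{-1} X', computed from a reduced QR."""
    Q, _ = np.linalg.qr(X)  # columns of Q are an orthonormal basis for col(X)
    return np.sum(Q ** 2, axis=1)

# Hypothetical simulated data, chosen only for illustration.
rng = np.random.default_rng(1)
n, p = 12, 4
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
y = X @ np.ones(p) + rng.normal(size=n)

h = leverages(X)
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
d = (y - X @ beta) ** 2  # squared residuals (unit deviances in the normal case)

# Since E(d_i) = sigma^2 * (1 - h_i), matching totals gives a leverage-adjusted estimate.
sigma2_leverage = d.sum() / (1.0 - h).sum()
sigma2_reml = d.sum() / (n - p)
print(f"sum of leverages = {h.sum():.3f}, leverage-adjusted = {sigma2_leverage:.3f}, REML = {sigma2_reml:.3f}")
```

When the dispersion itself depends on covariates, the individual factors 1 - h_i no longer cancel into a single divisor, which is why the adjustment has to be carried into the score vector and information matrices.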

Verbyla (1993) shows that REML estimators in normal linear regression enjoy a hitherto unappreciated robustness property, of being less sensitive than the maximum likelihood estimators to perturbations in the model. This property supports the notion that REML can be considered more reliable than maximum likelihood in small samples. We show, through two data examples, that our adjusted likelihood methods also enjoy this property in this more general context.

Section 2 of this paper introduces double generalized linear models, in which the mean and the dispersion are modelled simultaneously. The saddle-point approximation and its accuracy are discussed in Section 3. Section 4 discusses generalizations of REML to non-normal models. The application to double generalized linear models is set out in Section 5, and two data examples are worked through in Section 6. S-Plus functions to fit double generalized linear models are also described. The paper finishes with a summary and pointers to software availability.
