/
Home |

S-Archive | Download Script |

dglm |
Double generalized
linear models |

**DESCRIPTION**- Fits a generalized linear model with a link-linear model for the dispersion as well as
for the mean. Produces an object of class "dglm" which inherits from
"glm" and "lm".

**Note:**for this routine to work correctly with gamma responses you will need also the functions associated with the Digamma family and the polygamma functions. If you wish to use general power variance functions you will also need to the functions associated Tweedie family. **USAGE**`function(formula, dformula = ~ 1, family = gaussian(), dlink = "log", data = sys.parent(), subset = NULL, weights = NULL, contrasts = NULL, method = "ml", mustart = NULL, betastart = NULL, phistart = NULL, control = dglm.control(...), ykeep = T, xkeep = F, zkeep = F, ...)`**REQUIRED ARGUMENTS**`formula`a formula expression as for `glm`, of the form`response ~ predictors`. See the documentation of`lm`and`formula`for details. As for`glm`, this specifies the linear predictor for modelling the mean. A term of the form`offset(expression)`is allowed.**OPTIONAL ARGUMENTS**- Most arguments for
`dglm`are as for`glm`. The following arguments are new or altered:`dformula`a formula expression of the form `~ predictor`, the response being ignored. This specifies the linear predictor for modelling the dispersion. A term of the form`offset(expression)`is allowed.`dlink`link function for modelling the dispersion. Any link function accepted by the `quasi`family is allowed, including`power(x)`. See details below.`method`the method used to estimate the dispersion parameters; the default is "reml" for restricted maximum likelihood and the alternative is "ml" for maximum likelihood. Upper case and partial matches are allowed. `mustart`numeric vector giving starting values for the fitted values or expected responses. Must be of the same length as the response, or of length 1 if a constant starting vector is desired. Ignored if `betastart`is supplied.`betastart`numeric vector giving starting values for the regression coefficients in the link-linear model for the mean. `phistart`numeric vector giving starting values for the dispersion parameters. `control`a list of iteration and algorithmic constants. See `dglm.control`for their names and default values. These can also be set as arguments to`dglm`itself.`ykeep`logical flag: if `TRUE`, the vector of responses is returned.`xkeep`logical flag: if `TRUE`, the`model.matrix`for the mean model is returned.`zkeep`logical flag: if `TRUE`, the`model.matrix`for the dispersion model is returned. **VALUE**- an object of class
`dglm`is returned, which inherits from`glm`and`lm`. See`dglm.object`for details. **DETAILS**- Write
*m*= E(_{i}*y*) for the expectation of the_{i}*i*th response. Then Var(*y*) =_{i}*s*(_{i}V*m*) where_{i}*V*is the variance function and*s*is the dispersion of the_{i}*i*th response (often denoted as the Greek character phi). We assume the link linear models*g*(*m*) =_{i}**x**_{i}^{T},*b**h*(*s*) =_{i}**z**_{i}^{T},*a*where

and**x**_{i}are vectors of covariates, and**z**_{i}and*b*are vectors of regression cofficients affecting the mean and dispersion respectively. The argument*a*`dlink`specifies*h*. See`family`in the S-Plus help for how to specify*g*. The optional arguments`mustart`,`betastart`and`phistart`specify starting values for*m*,_{i}and*b**s*respectively._{i}The parameters

are estimated as for an ordinary glm. The parameters**b**are estimated by way of a dual glm in which the deviance components of the ordinary glm appear as responses. The estimation procedure alternates between one iteration for the mean submodel and one iteration for the dispersion submodel until overall convergence.**a**The output from

`dglm`,`out`say, consists of two`glm`objects (that for the dispersion submodel is`out$dispersion.fit`) with a few more components for the outer iteration and overall likelihood. The`summary`and`anova`functions have special methods for`dglm`objects. Any generic function which has methods for glms or lms will work on`out`, giving information about the mean submodel. Information about the dispersion submodel can be obtained by using`out$dispersion.fit`as argument rather than`out`itself. In particular`drop1(out,scale=1)`gives correct score statistics for removing terms from the mean submodel, while`drop1(out$dispersion.fit,scale=2)`gives correct score statistics for removing terms from the dispersion submodel. - The dispersion submodel is treated as a
`gamma`family unless the original reponses are gamma, in which case the dispersion submodel is`digamma`. (Note that the`digamma`and`trigamma`functions are required to fit a`digamma`family.) This is exact if the original glm family is`gaussian`,`Gamma`or`inverse.gaussian`. In other cases it can be justified by the saddle-point approximation to the density of the responses. The results will therefore be close to exact ML or REML when the dispersions are small compared to the means. In all cases the dispersion submodel as prior weights 1, and has its own dispersion parameter which is 2. **REFERENCES**- Smyth, G. K. (1989). Generalized linear models with varying dispersion.
*J. R. Statist. Soc. B*,**51**, 47--60.

Smyth, G. K., and Verbyla, A. P. (1999). Adjusted likelihood methods for modelling dispersion in generalized linear models.*Environmetrics***10**, 696-709. (Abstract - Zipped PostScript)

Smyth, GK, and Verbyla, AP (2009). Leverage adjustments for dispersion modelling in generalized nonlinear models.*Australian and New Zealand Journal of Statistics***51**, 433-448. **SEE ALSO**- dglm.object, Digamma Family, Polygamma Functions
- The following programs are included in the
`dglm`software distribution, but are not separately documented:`dglm.control`,`summary.dglm`,`print.summary.dglm`,`anova.dglm`,`glm.constant`. **WARNING**- The anova method is questionable when applied to an dglm object with method="reml" (stick to "ml").
**EXAMPLES**`# Fit a Gamma double generalized linear model.`

# A log-linear model with covariates z is used for the dispersion.

out <- dglm(y~x,~z,family=Gamma(link=log))

summary(out)

anova(out)

# Summarize the mean model as for a glm

summary.glm(out)

# Summarize the dispersion model as for a glm

summary(out$dispersion.fit)

# Examine goodness of fit of dispersion model by plotting residuals

plot(fitted(out$dispersion.fit),residuals(out$dispersion.fit))

S-Archive | Download Script |

Gordon Smyth.
Copyright © 1996-2016. *Last modified:
10 February 2004*