/
Home |

S-Archive | Download Script |

reglm |
Generalized linear
models with random effects |

**DESCRIPTION**- Fits a generalized linear model including random factors using the method of Schall (1991). Produces an object of class "glm" with some additional components.
**USAGE**`reglm(formula = formula(data), random = NULL, family = gaussian(), data = sys.parent(), weights = NULL, dispersion = NULL, subset = NULL, na.action = na.fail, contrasts = NULL, start = NULL, mustart = NULL, betastart = NULL, sigmastart = NULL, control = reglm.control(...), ykeep = T, xkeep = F, ukeep = F, ...)`**REQUIRED ARGUMENTS**`formula`a formula expression as for `glm`, of the form`response ~ predictors`. This specifies the fixed part of the model. See the documentation of`lm`and`formula`for details. As for`glm`, this specifies the linear predictor for modelling the mean. A term of the form`offset(expression)`is allowed.**OPTIONAL ARGUMENTS**- Almost all arguments for
`reglm`are as for`glm`. The following arguments are new or altered. The first is the important one. The others are likely to be of only occasional use.`random`a list object, the components of which are the factors to be treated as random. This specifies the random part of the model. `dispersion`specifies a prior value for the dispersion of the glm. This argument defaults to 1 for the binomial and Poisson families. Same as the $scale command in GLIM. `mustart`numeric vector giving starting values for the fitted values or expected responses (including the predicted values of the random effects). This is the same as the usual glm argument `start`except on the scale of the means rather than of the linear predictors. Must be of the same length as the response, or of length 1 if a constant starting vector is desired. Ignored if`betastart`is supplied.`betastart`numeric vector giving starting values for the regression coefficients in the link-linear model for the mean (including predicted values of the random effects). `sigmastart`numeric vector giving starting values for the variance components `control`a list of iteration and algorithmic constants. See `reglm.control`for their names and default values. These can also be set as arguments to`dglm`itself. This argument is the same as control.glm except that the defaults have been changed.`ykeep`logical flag: if `TRUE`, the vector of responses is returned. Same as`y`for glm.`xkeep`logical flag: if `TRUE`, the`model.matrix`for the mean model is returned. Same as`x`for glm.`zkeep`logical flag: if `TRUE`, the`model.matrix`for the variance components is returned. **VALUE**- an object of class
`glm`is returned. The following components are additional to the usual glm components:`sigma`vector of estimated variance components. `dispersion`estimated dispersion. If the family had a prior dispersion, then the dispersion is estimated only at convergence of the iteration and returned to diagnose over- or under-dispersion. `prior.dispersion`prior value of the dispersion for the glm. Same as the input argument dispersion if that was specified. `q`vector of dimensions of the random factors.. `edf`vector of effective degrees of freedom used to estimate the variance components. `u`design matrix for the random factors, if ukeep=T on input. **DETAILS**- We assume the linear predictor for the generalized linear model is of the form
*g*(**m**) =*X***b***+ U*_{1}**b**_{1}+ ... + U_{c}**b**_{c}where

**b**is a vector of fixed coefficients andare vectors of random effects. The random effects are assumed to have variances s**b**_{1}...**b**_{c}s_{1}.... The design matrix_{c}*X*is specified using the model formula and the design matrices*U*are specified using the input argument_{1}... U_{c}`random`.

The parameters are estimated here using the REML method described in Schall (1991). **REFERENCES**- Schall, R (1991). Estimation in generalized linear models with random effects.
*Biometrika*,**78**, 719-727.

Candy, S. G. (2004). Modelling catch and effort data using generalized linear models, the Tweedie distribution, random vessel effects and random stratum-by-year effects.*CCAMLR Science***11**, 59-80. **ACKNOWLEDGEMENTS**- Thanks for Dr Steve Candy, Australian Antarctic Division, for contributing a bug fix, 6 Nov 2004.
**SEE ALSO**- dglm
- The following program is included in the
`reglm`software distribution, but is not separately documented:`reglm.control`. **BUGS**- On output the regression parameters don't have labels.
**EXAMPLES**- Here we demonstrate the reglm function on the radiation of cancer cells data
example from Schall (1991).
Basic analysis of the data shows that there is strong evidence for differences between the occasions, and also good evidence for overdispersion between the dishes (observations) within occasions.
> radiatio <- read.table("radatio.txt",header=T) > attach(radiatio) > n <- rep(400,27)

We specify a binomial model with two random effects. The factor Occasion represents the radiation occasion. There is also a random effect at the observation level, represented by (1:27) since there are 27 observations, which picks up any over-dispersion in the data. Trace is turned on so that we can see the progress of the iteration. The variance components and dispersion are output at each iteration.> out <- reglm(Survived/n~1,list(Occasion=Occasion,Units=(1:27)), family=binomial,weights=n,trace=T) Iter 1 0.0000505604239786021 0.0000166989428102248 1 Iter 2 0.00283092775872843 0.000306570522656396 1 Iter 3 0.0914639350937484 0.00247127680331739 1 Iter 4 0.207417308402352 0.00381847728525275 1 Iter 5 0.222886503891888 0.00539847167615311 1 Iter 6 0.223151486991139 0.00690093186214274 1 Iter 7 0.222683195620167 0.00805926586181473 1 Iter 8 0.222303061660932 0.00882599204674674 1 Iter 9 0.222049310739124 0.00928435931728272 1 Iter 10 0.22189665203915 0.0095420092039735 1 Iter 11 0.221810447442714 0.00968186587710608 1 Iter 12 0.221763517700708 0.0097563502955682 1 Iter 13 0.221738481620298 0.00979561756518506 1 Iter 14 0.221725270519245 0.00981620792679515 1 Iter 15 0.221718339587564 0.00982697440004381 1 Iter 16 0.221714714509054 0.00983259577809794 1 Iter 17 0.221712821520788 0.00983552854676926 1 > out$sigma [1] 0.221712822 0.009835529 > out$dispersion [1] 0.9369549

The final variance components, and residual variance estimate, are the same as in Schall (1991). Note that the prior dispersion in this case is 1.> out$q [1] 9 27 > out$edf [1] 7.744181 8.232174

The random effects have dimensions 9 and 27 respectively. After smoothing the random effects, the effective degrees of freedom are 7.7 and 8.2 respectively. Including Occasion as a fixed effect would have used 8 df. In this case the effect of Occasion is so strong that there is very little difference between the fixed and random models for it.> out$rank-sum(out$q)+sum(out$edf) [1] 16.97635

out$rank gives the total number of parameters in the linear predictor including both the fixed and random parameters. Therefore out$rank-sum(out$q) gives the number of fixed parameters, in this case 1. The total effective degrees of freedom used in modelling the means is just under 17.0, including the 1 degree of freedom for the fixed model.> coef(out)[2:10] [1] 0.69126760 -0.18631757 -0.70870278 -0.01694652 0.19448300 -0.02442938 [7] 0.60688086 -0.54311626 -0.01311896 > contrasts(Occasion) <- contr.sum(levels(Occasion)) > out.fixed <- glm(Survived/n~Occasion,family=binomial,weights=n) > coef(out.fixed)[2:9] Occasion1 Occasion2 Occasion3 Occasion4 Occasion5 Occasion6 0.7128812 -0.1915751 -0.7354824 -0.01624658 0.2027526 -0.02395969 Occasion7 Occasion8 0.6260501 -0.5620218

Note that the estimated coefficients for Occasion (there are 9 of them and they add to zero) are similar to the equivalent parameters from the fixed model, but shrunk somewhat towards zero. In the fixed model, the last coefficient is not included explicitly because the coefficients are constrained to add to zero.

S-Archive | Download Script |

Gordon Smyth.
Copyright © 1996-2016. *Last modified:
10 February 2004*