S-Archive | Download Script |

tariff | Estimate insurance tariffs

**DESCRIPTION**- Estimate a mean and dispersion model for the cost and frequency of insurance claims.
Allows the estimation of insurance tariffs. Produces a double generalized linear model
object of class "dglm" which inherits from "glm" and "lm".

**Note:** To use this function, you will also need the functions associated with dglm and the Tweedie family.

**USAGE**

`tariff <- function(formula = formula(data), dformula = ~1, nclaims = NULL, exposure = NULL, link.power = 0, dlink.power = 0, var.power = 1.5, data = sys.parent(), subset = NULL, contrasts = NULL, method = "ml", mustart = NULL, betastart = NULL, phistart = NULL, control = dglm.control(...), ykeep = T, xkeep = F, zkeep = F, ...)`

**REQUIRED ARGUMENTS**

`formula` a formula expression as for `glm`, of the form `response ~ predictors`. See the documentation of `lm` and `formula` for details. As for `glm`, this specifies the linear predictor for modelling the mean. A term of the form `offset(expression)` is allowed. The response should be the total cost of claims divided by the number of claims.

**OPTIONAL ARGUMENTS**

`dformula` a formula expression of the form `~ predictor`, the response being ignored. This specifies the linear predictor for modelling the dispersion. A term of the form `offset(expression)` is allowed. For insurance modelling, this will often be the same as the mean model.

`nclaims` vector giving the number of claims.

`exposure` vector giving a measure of exposure to risk, usually proportional to policy years.

`link.power` link function for modelling the mean. A linear predictor is used for the mean raised to `link.power`, with 0 indicating the log-link.

`dlink.power` link function for modelling the dispersion. A linear predictor is used for the dispersion raised to `dlink.power`, with 0 indicating the log-link.

`var.power` scalar. The variance is assumed proportional to the mean raised to this power. Must be between 1 and 2.

`data` as for the `glm` function; see S-Plus documentation.

`subset` as for the `glm` function; see S-Plus documentation.

`contrasts` as for the `glm` function; see S-Plus documentation.

`method` the method used to estimate the dispersion parameters; the default is "ml" for maximum likelihood and the alternative is "reml" for restricted maximum likelihood. Upper case and partial matches are allowed.

`mustart` numeric vector giving starting values for the fitted values or expected responses. Must be of the same length as the response, or of length 1 if a constant starting vector is desired. Ignored if `betastart` is supplied.

`betastart` numeric vector giving starting values for the regression coefficients in the link-linear model for the mean.

`phistart` numeric vector giving starting values for the dispersion parameters.

`control` a list of iteration and algorithmic constants. See `dglm.control` for their names and default values. These can also be set as arguments to `tariff` itself.

`ykeep` logical flag: if `TRUE`, the vector of responses is returned.

`xkeep` logical flag: if `TRUE`, the `model.matrix` for the mean model is returned.

`zkeep` logical flag: if `TRUE`, the `model.matrix` for the dispersion model is returned.

**VALUE**- an object of class `dglm` is returned, which inherits from `glm` and `lm`. See `dglm.object` for details.

**DETAILS**- Let *z*_{i} be the total cost of claims in the *i*th category, and let *n*_{i} be the number of claims. We assume that the *n*_{i} are Poisson and that the size of each claim follows a gamma distribution. This implies that the average observed claim size *y*_{i} = *z*_{i}/*n*_{i} follows Tweedie's compound Poisson distribution. The function tariff computes maximum likelihood or restricted maximum likelihood estimators for the parameters based on the joint likelihood of *y*_{i} and *n*_{i}.
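
The Poisson-gamma construction described above can be illustrated with a short simulation. This is only a sketch in base R and does not call `tariff` or `dglm`. The values of `p`, `mu` and `phi` are illustrative assumptions, and the formulas for `lambda`, `alpha` and `tau` are the standard Tweedie compound Poisson parametrization, which is assumed here rather than taken from this page. The simulated total cost should have mean close to `mu` and variance close to `phi * mu^p`.

```r
set.seed(1)  # arbitrary seed, for reproducibility only

# Illustrative values (assumptions, not estimates from any data set).
p   <- 1.72   # variance power, between 1 and 2
mu  <- 2      # mean total cost
phi <- 1.5    # dispersion

# Standard Tweedie compound Poisson parametrization (assumed).
lambda <- mu^(2 - p) / (phi * (2 - p))   # Poisson mean number of claims
alpha  <- (2 - p) / (p - 1)              # gamma shape of a single claim
tau    <- phi * (p - 1) * mu^(p - 1)     # gamma scale of a single claim

nsim <- 200000
n <- rpois(nsim, lambda)

# A sum of n independent gamma(alpha, tau) claims is gamma(n * alpha, tau);
# categories with no claims have a total cost of exactly zero.
z <- numeric(nsim)
pos <- n > 0
z[pos] <- rgamma(sum(pos), shape = n[pos] * alpha, scale = tau)

# Compare simulated moments with the theoretical mean and variance.
rbind(simulated = c(mean = mean(z), var = var(z)),
      theory    = c(mean = mu,      var = phi * mu^p))
```

Note that the gamma shape `alpha` is determined by the variance power alone, which is why a value between 1 and 2 corresponds to a mixed distribution with a point mass at zero.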

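
The roles of `link.power`, `dlink.power` and `var.power` can be summarised in symbols. The notation below is introduced for illustration and is not part of the original documentation: $\mu_i$ and $\phi_i$ denote the mean and dispersion of the $i$th response, $x_i$ and $u_i$ the covariate vectors from `formula` and `dformula`, $\beta$ and $\gamma$ the corresponding coefficients, and $\lambda$, $\delta$, $p$ the values of `link.power`, `dlink.power` and `var.power`. Any prior weights are omitted, so the variance is written only up to proportionality.

$$
\begin{aligned}
g(\mu_i) &= x_i^{\top}\beta, &
g(\mu) &= \begin{cases} \mu^{\lambda}, & \lambda \neq 0 \\ \log \mu, & \lambda = 0 \end{cases} \\
h(\phi_i) &= u_i^{\top}\gamma, &
h(\phi) &= \begin{cases} \phi^{\delta}, & \delta \neq 0 \\ \log \phi, & \delta = 0 \end{cases} \\
\operatorname{Var}(y_i) &\propto \phi_i\,\mu_i^{\,p}, & 1 &< p < 2
\end{aligned}
$$

With the defaults $\lambda = \delta = 0$, both the mean and the dispersion are modelled on the log scale.
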
The function is similar in structure to the double generalized linear model function dglm, and it returns an object of the same class.

**REFERENCES**- Smyth, G. K., and Verbyla, A. P. (1999). Adjusted likelihood methods for modelling dispersion in generalized linear models. *Environmetrics* **10**, 696-709.

Smyth, G. K., and Jørgensen, B. (To appear). Fitting Tweedie's Compound Poisson Model to Insurance Claims Data: Dispersion Modelling. *ASTIN Bulletin*.

**SEE ALSO**- dglm, dglm.object, Tweedie family.
**WARNING**- The anova method is questionable when applied to a dglm object fitted with method="reml"; stick to "ml".
**EXAMPLES**- Estimate tariffs for the Swedish 3rd party motor insurance data. This reproduces results from Smyth and Jørgensen (in press).

    # Read the data and keep Zone 1, excluding Make 9
    motorins <- read.table("c:/gordon/www/data/general/motorins.txt",header=T)
    motorins <- motorins[motorins$Zone == 1 & motorins$Make != 9,]

    # Treat the rating variables as factors with treatment contrasts
    motorins$Bonus <- factor(motorins$Bonus)
    motorins$Make <- factor(motorins$Make)
    motorins$Kilometres <- factor(motorins$Kilometres)
    contrasts(motorins$Bonus) <- contr.treatment(levels(motorins$Bonus))
    contrasts(motorins$Make) <- contr.treatment(levels(motorins$Make))
    contrasts(motorins$Kilometres) <- contr.treatment(levels(motorins$Kilometres))
    attach(motorins)

    # Fit the same predictors for the mean and the dispersion
    out <- tariff(Payment/Insured~Bonus+Make+Kilometres,~Bonus+Make+Kilometres,nclaims=Claims,exposure=Insured,var.power=1.72)
    summary(out)

    # Base risk
    tapply(fitted(out),list(Bonus,Make,Kilometres),mean)[1,1,1]

    # Multiplicative tariff factors for other factor levels
    exp(coef(out))
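
The last step of the example relies on the log link (`link.power = 0`, the default): on that scale each coefficient contributes a multiplicative factor to the fitted mean. The following base R sketch shows the arithmetic with made-up coefficients. The coefficient names and values are hypothetical, not results from the motor insurance data, and the sketch assumes treatment contrasts and no offset, so that the base risk equals the exponentiated intercept.

```r
# Hypothetical log-link coefficients (invented for illustration only).
beta <- c("(Intercept)" = -2.0, Bonus2 = -0.30, Make2 = 0.10, Kilometres2 = 0.20)

# Base risk for the reference cell and multiplicative factors for other levels.
base.risk <- exp(beta[["(Intercept)"]])
factors <- exp(beta[-1])

# Tariff for a policy in Bonus class 2, Make 1 (reference), Kilometres class 2.
tariff.cell <- base.risk * factors[["Bonus2"]] * factors[["Kilometres2"]]

# The same value computed directly from the linear predictor.
eta <- beta[["(Intercept)"]] + beta[["Bonus2"]] + beta[["Kilometres2"]]
all.equal(tariff.cell, exp(eta))
```

Reference levels carry a factor of 1, so only the non-reference levels of each rating variable appear in the product.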

Gordon Smyth.
Copyright © 1996-2016. *Last modified: 10 February 2004*