[R] GLM model selection

Prof Brian Ripley ripley at stats.ox.ac.uk
Fri Jun 29 14:05:45 CEST 2001


On Fri, 29 Jun 2001, Antonio Olinto wrote:

> Dear R list members,
>
> I'd like to know whether the AIC statistic used by step() in R is suitable
> for selecting GLM models with gamma error distribution and log link function.

Look at

> extractAIC.glm
function (fit, scale = 0, k = 2, ...)
{
    n <- length(fit$residuals)
    edf <- n - fit$df.residual
    aic <- fit$aic
    c(edf, aic + (k - 2) * edf)
}

which uses the aic value from

> Gamma(log)$aic
function (y, n, mu, wt, dev)
{
    n <- sum(wt)
    disp <- dev/n
    -2 * sum(dgamma(y, 1/disp, mu * disp, log = TRUE) * wt) +
        2
}

That's not the correct AIC with unknown scale, since the scale estimate
used there (disp <- dev/n) is not the MLE.
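
A short derivation may make this concrete. It is only a sketch under
assumptions not stated in the post: unit prior weights, and the symbols
\alpha = 1/\phi for the gamma shape, \hat\mu_i for the fitted means, \psi for
the digamma function and D for the residual deviance are introduced here.

```latex
% y_i ~ Gamma(shape = \alpha, mean = \mu_i), with \alpha = 1/\phi
\ell(\alpha) = \sum_{i=1}^{n} \left[ \alpha \log\frac{\alpha}{\hat\mu_i}
    - \log\Gamma(\alpha) + (\alpha - 1)\log y_i
    - \frac{\alpha y_i}{\hat\mu_i} \right]

% Setting the derivative with respect to \alpha to zero gives
\log\hat\alpha - \psi(\hat\alpha) = \frac{D}{2n},
\qquad
D = 2 \sum_{i=1}^{n} \left[ \frac{y_i - \hat\mu_i}{\hat\mu_i}
    - \log\frac{y_i}{\hat\mu_i} \right]
```

Because \log\alpha - \psi(\alpha) is approximately 1/(2\alpha) only for large
\alpha, the plug-in value 1/disp = n/D agrees with the MLE only
approximately, and the two differ most when the shape is small.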

> A statistician friend of mine (an S-PLUS user) told me to take care because in
> the gamma distribution phi (the scale?) is not constant. Also, the step() help
> page says that "there is a potential problem in using glm fits with a variable
> scale as in that case the deviance is not simply related to the maximized
> log-likelihood".
>
> If it's not the most appropriate way to select the model, what would be the
> best way to perform the selection?

You could rewrite the AIC to use the MLE for the scale and the correct
formulae.
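
One way this suggestion might be carried out is sketched below. It is an
illustration, not code from the post: gamma_aic_mle() is a hypothetical
helper, the simulated data are made up for the example, and unit prior
weights are assumed. The regression coefficients do not depend on the shape,
so the fitted means are kept and only the shape is maximized numerically.
(MASS also provides gamma.shape() for the shape MLE.)

```r
# Hypothetical helper (not part of R): AIC for a Gamma GLM using the MLE of
# the shape parameter rather than the plug-in value n/deviance.
# Assumption: unit prior weights.
gamma_aic_mle <- function(fit, k = 2) {
    y  <- fit$y          # response stored by glm() (y = TRUE is the default)
    mu <- fitted(fit)    # fitted means; these do not depend on the shape
    # Negative log-likelihood as a function of log(shape)
    nll <- function(log_alpha) {
        alpha <- exp(log_alpha)
        -sum(dgamma(y, shape = alpha, rate = alpha / mu, log = TRUE))
    }
    opt <- optimize(nll, interval = c(-10, 10), tol = 1e-8)
    edf <- fit$rank + 1  # regression coefficients plus the shape
    c(edf = edf, AIC = 2 * opt$objective + k * edf, shape = exp(opt$minimum))
}

# Illustrative simulated data: log link, true shape 3
set.seed(1)
x <- runif(200)
y <- rgamma(200, shape = 3, rate = 3 / exp(1 + 2 * x))
fit0 <- glm(y ~ 1, family = Gamma(link = "log"))
fit1 <- glm(y ~ x, family = Gamma(link = "log"))

res0 <- gamma_aic_mle(fit0)
res1 <- gamma_aic_mle(fit1)
print(rbind(null = res0, with_x = res1))
```

Since the shape maximizes the likelihood, this value can never exceed the AIC
computed with the dev/n plug-in for the same model and the same parameter
count. To use it inside step(), one would still need to supply it through an
extractAIC method, which is not shown here.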


-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272860 (secr)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
