[R] An AIC model selection question

Ben Bolker bolker at ufl.edu
Thu Oct 2 15:13:35 CEST 2008


Uwe Ligges <ligges <at> statistik.tu-dortmund.de> writes:

> 
> 
> Christoph Scherber wrote:
> > Dear R users,
> > 
> > Assume I have three models with the following AIC values:
> > 
> > model     AIC   df
> > model1    -10    2
> > model2    -12    5
> > model3    -11    2
> > 
> > Obviously, model2 would be preferred, but it "wastes" 5 df compared to 
> > the other models.
> > 
> > Would it be allowed to select model3 instead, simply because it uses up 
> > fewer df and the delta-AIC between model2 and model3 is just 1?
> 
> Well, on the one hand, the degrees of freedom are already part of the 
> AIC calculation. So if you say you really want to do model selection 
> based on AIC, the answer is "no".
> On the other hand, AIC is just one possible way to penalize the 
> likelihood by the number of degrees of freedom used. You can 
> choose a different criterion if you think the penalty 
> in AIC is too weak for you.
> 
> Best,
> Uwe Ligges

  (Actually, it looks like model2 only "wastes" 3 df
relative to the other two models: 5 vs. 2.) AIC is telling
you that those 3 extra df are worth it in terms of improved
model fit (and expected predictive capability). Given
that these models are (as you say) all more or less
equivalent in terms of expected predictive ability, with
delta-AIC values of only 1 and 2, other factors (biological
realism, etc.) *might* come into play. However, you should
be very careful and very explicit when you start
introducing other factors. Especially if the models
tell different stories or make different predictions,
you should consider model averaging, or decline to
select a single model to describe the data.
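
  To make "more or less equivalent" concrete, here is a minimal
sketch of Akaike weights, the usual starting point for model
averaging. It uses only the three AIC values from the table in
the question; the variable names are just illustrative, and it
assumes all three models were fitted to the same data.

```r
# AIC values copied from the table in the original question
aic <- c(model1 = -10, model2 = -12, model3 = -11)

# Difference from the best (lowest-AIC) model
delta <- aic - min(aic)

# Akaike weights: relative likelihoods exp(-delta/2),
# normalised so that they sum to 1
rel_lik <- exp(-delta / 2)
weights <- rel_lik / sum(rel_lik)

round(weights, 3)
```

The weights come out at roughly 0.19, 0.51 and 0.31 for model1,
model2 and model3, so model2 is the best of the three but gets
only about half of the total weight.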

  To speak to Uwe's comment, if your data sets are
small you might want to consider "corrected" AIC (AICc),
which increases the penalty term for complexity.
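
  As a rough sketch of what that correction does, the usual
small-sample formula is AICc = AIC + 2k(k+1)/(n-k-1), where k
is the number of estimated parameters and n the number of
observations. The code below assumes that the "df" column in
the question is k, and the sample size n = 20 is purely
hypothetical (the original post does not give it); aicc() is
a helper defined here, not a base R function.

```r
# Small-sample corrected AIC.
# aic: AIC value(s); k: number of estimated parameters; n: sample size
aicc <- function(aic, k, n) {
  stopifnot(all(n - k - 1 > 0))  # correction is undefined otherwise
  aic + 2 * k * (k + 1) / (n - k - 1)
}

# Values from the table in the original question
aic <- c(model1 = -10, model2 = -12, model3 = -11)
k   <- c(model1 = 2,   model2 = 5,   model3 = 2)

n <- 20  # HYPOTHETICAL sample size, not given in the original post

aicc(aic, k, n)
```

With this assumed n, the extra penalty is much larger for the
5-parameter model than for the 2-parameter ones, and model3 ends
up with the lowest AICc. With a large n the correction becomes
negligible and the ranking reverts to the plain AIC one.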

  Ben Bolker


