[R] How do I compare 47 GLM models with 1 to 5 interactions and unique combinations?

Milan Bouchet-Valat nalimilan at club.fr
Wed Jan 25 10:32:44 CET 2012


On Tuesday, January 24, 2012 at 20:41 -0800, Jhope wrote:
> Hi R-listers,
> 
> I have developed 47 GLM models with different combinations of interactions
> from 1 variable to 5 variables. I have manually made each model separately
> and put them into individual tables (organized by the number of variables)
> showing the AIC score. I want to compare all of these models. 
> 
> 1) What is the best way to compare various models with unique combinations
> and different numbers of variables?
See ?step or ?stepAIC (from package MASS) if you want an automated way
of doing this.
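
As a rough sketch of what an automated search with step() could look like:
the data frame below is simulated (the name sim and all values in it are
made up for illustration, it only stands in for data.to.analyze), and the
scope is an arbitrary choice limited to HTL, Veg and their interaction.

```r
# Simulated stand-in for data.to.analyze (all values are invented)
set.seed(1)
n <- 60
sim <- data.frame(
  HTL = runif(n, 0, 30),
  Veg = runif(n, 0, 20),
  TotalEggs = rpois(n, 90) + 10
)
sim$Shells <- rbinom(n, sim$TotalEggs, plogis(-0.5 + 0.05 * sim$HTL))

# Start from the intercept-only model
null.glm <- glm(cbind(Shells, TotalEggs - Shells) ~ 1,
                family = binomial, data = sim)

# Let step() add or drop terms within the scope, keeping a change
# only when it lowers the AIC
best.glm <- step(null.glm,
                 scope = list(lower = ~ 1, upper = ~ HTL * Veg),
                 direction = "both", trace = 0)

formula(best.glm)
AIC(best.glm)
```

Note that step() respects marginality, so it will only try HTL:Veg once
both main effects are in the model. It will not visit interaction-only
formulas like the ones quoted below.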

> 2) Ideally, I am trying to develop the simplest model. Even though
> adding another variable would lower the AIC, how do I decide whether it is
> worth including another variable in the model? How do I know when to stop?
This is a general statistical question, not specific to R. As a general
rule, if adding a variable lowers the AIC by a clear margin (a difference
of less than about 2 is commonly treated as negligible), then it's worth
including it. You can stop when no remaining variable lowers the AIC any
further. But this assumes you consider the AIC a good criterion for your
problem and you know what you're doing. There's plenty of literature on
this subject.
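
To make the comparison concrete, here is a minimal sketch of how the change
in AIC from adding one variable could be checked. It uses simulated data
(sim and its values are invented, not the real data.to.analyze), and the
choice of HTL and Veg as the two variables is only an example.

```r
# Simulated stand-in for data.to.analyze (all values are invented)
set.seed(2)
n <- 80
sim <- data.frame(
  HTL = runif(n, 0, 30),
  Veg = runif(n, 0, 20),
  TotalEggs = rpois(n, 90) + 10
)
sim$Shells <- rbinom(n, sim$TotalEggs, plogis(-1 + 0.08 * sim$HTL))

# Smaller model, then the same model with one more variable
small <- glm(cbind(Shells, TotalEggs - Shells) ~ HTL,
             family = binomial, data = sim)
large <- update(small, . ~ . + Veg)

# Positive value: the extra variable lowered the AIC by that amount
delta <- AIC(small) - AIC(large)
delta

# Likelihood-ratio test for the added term, as a complementary check
anova(small, large, test = "Chisq")
```

Since the two models are nested, the anova() call is valid here. For
non-nested models only the AIC comparison applies.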

> Definitions of Variables:
> HTL - distance to high tide line (continuous)
> Veg - distance to vegetation 
> Aeventexhumed - Event of exhumation
> Sector - number measurements along the beach
> Rayos - major sections of beach (grouped sectors)
> TotalEggs - nest egg density
> 
> Example of how all models were created: 
> Model2.glm <- glm(cbind(Shells, TotalEggs-Shells) ~ Aeventexhumed,
> data=data.to.analyze, family=binomial)
> Model7.glm <- glm(cbind(Shells, TotalEggs-Shells) ~ HTL:Veg, family =
> binomial, data.to.analyze)
> Model21.glm <- glm(cbind(Shells, TotalEggs-Shells) ~ HTL:Veg:TotalEggs,
> data.to.analyze, family = binomial)
> Model37.glm <- glm(cbind(Shells, TotalEggs-Shells) ~
> HTL:Veg:TotalEggs:Aeventexhumed, data.to.analyze, family=binomial)
To extract the AICs of all these models, it's easier to put them in a
list and get their AICs like this:
m <- list()
m$model2 <- glm(cbind(Shells, TotalEggs-Shells) ~ Aeventexhumed,
                family=binomial, data=data.to.analyze)
m$model7 <- glm(cbind(Shells, TotalEggs-Shells) ~ HTL:Veg,
                family=binomial, data=data.to.analyze)

# One column per model: row 1 is the equivalent degrees of freedom,
# row 2 is the AIC
sapply(m, extractAIC)
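
With 47 models it may be handier to turn that result into a table sorted
by AIC. A possible sketch, again on simulated data (sim, the model names
and the three formulas are placeholders chosen for illustration):

```r
# Simulated stand-in for data.to.analyze (all values are invented)
set.seed(3)
n <- 80
sim <- data.frame(
  HTL = runif(n, 0, 30),
  Veg = runif(n, 0, 20),
  TotalEggs = rpois(n, 90) + 10
)
sim$Shells <- rbinom(n, sim$TotalEggs, plogis(-1 + 0.08 * sim$HTL))

# Keep the formulas in a named list and fit them all in one go
forms <- list(
  m.HTL     = cbind(Shells, TotalEggs - Shells) ~ HTL,
  m.HTL.Veg = cbind(Shells, TotalEggs - Shells) ~ HTL:Veg,
  m.three   = cbind(Shells, TotalEggs - Shells) ~ HTL:Veg:TotalEggs
)
m <- lapply(forms, function(f) glm(f, family = binomial, data = sim))

# extractAIC() returns c(edf, AIC) for each model
aic <- sapply(m, extractAIC)
tab <- data.frame(edf = aic[1, ], AIC = aic[2, ])

# Difference from the best (lowest) AIC, then sort with the best model first
tab$dAIC <- tab$AIC - min(tab$AIC)
tab <- tab[order(tab$AIC), ]
tab
```

The row names of tab are the model names, so the best model is simply
rownames(tab)[1].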


Cheers


