[R] model simplification using Crawley as a guide
P.Dalgaard at biostat.ku.dk
Wed Jun 11 14:14:19 CEST 2008
> I have consciously avoided using step() for model simplification in favour
> of manually updating the model by removing non-significant terms one at a
> time. I'm using The R Book by M.J. Crawley as a guide. It comes as no
> surprise that my analysis does not proceed as smoothly as Crawley's does,
> and being a beginner, I'm struggling with what to do next.
> I have a model:
> lm(y~A * B * C)
> where A is a categorical variable with three levels and B and C are
> continuous covariates.
> Following Crawley, I execute the model, then use summary.aov() to identify
> non-significant terms. I begin deleting non-significant interaction terms
> one at a time (using update). After each update() statement, I use
> anova(modelOld,modelNew) to contrast the previous model with the updated
> one. After removing all the interaction terms, I'm left with:
> lm(y~ A + B + C)
> again, using summary.aov() I identify A to be non-significant, so I remove
> it, leaving:
> lm(y~B + C)
> where both B and C are continuous variables.
> Does it still make sense to use summary.aov() or should I use summary.lm()
> instead? Has the analysis switched from an ANCOVA to a regression? Both
> give different results so I'm uncertain which summary to accept.
> Any help would be appreciated!
Does he really recommend using summary.aov() on an lm object??? I
wouldn't. It _might_ give sensible results, but in general, aov() and
its methods rely on balancedness and orthogonality properties of the
design, to the extent that I'm inclined to say that if you do not know
exactly what is going on, it is probably the wrong thing.
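A small sketch of why the two summaries can disagree, using simulated data
(the sample size, coefficients and object names fit_bc and fit_cb are made up
for illustration and are not from the original question). summary.aov()
reports sequential sums of squares, so with correlated predictors the row for
a term depends on where it sits in the formula. summary.lm() tests each
coefficient given all the others, so term order does not matter there.

```r
set.seed(1)
n <- 60
B <- rnorm(n)
C <- 0.7 * B + rnorm(n, sd = 0.5)  # correlated with B: a non-orthogonal design
y <- 1 + 0.5 * B + 0.5 * C + rnorm(n)

fit_bc <- lm(y ~ B + C)
fit_cb <- lm(y ~ C + B)  # same model, terms in the opposite order

# Sequential tables: the rows for B (and for C) change with the term order
summary.aov(fit_bc)
summary.aov(fit_cb)

# Marginal t tests: identical for both fits apart from row order
summary(fit_bc)$coefficients
summary(fit_cb)$coefficients
```

Only the last row of each sequential table agrees with summary.lm(): its F
value is the square of the corresponding t value.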
I'd use drop1 throughout.
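
A minimal sketch of what that could look like for the model in the question,
on simulated data (the data frame dat, its values and the object names full,
m2 and additive are assumptions for illustration only). drop1() offers only
those terms that can be removed without violating marginality, and tests each
one against the current model.

```r
set.seed(42)
n <- 90
dat <- data.frame(
  A = factor(rep(c("a1", "a2", "a3"), each = n / 3)),  # three-level factor
  B = rnorm(n),                                        # continuous covariate
  C = rnorm(n)                                         # continuous covariate
)
dat$y <- 2 + 1.5 * dat$B - 1.0 * dat$C + rnorm(n)

full <- lm(y ~ A * B * C, data = dat)

# From the full model only the three-way interaction is a candidate
drop1(full, test = "F")

# Remove it, then look again: now the two-way interactions are candidates
m2 <- update(full, . ~ . - A:B:C)
drop1(m2, test = "F")

# In the additive model every main effect is tested given the others;
# A is tested on 2 df because it has three levels
additive <- lm(y ~ A + B + C, data = dat)
drop1(additive, test = "F")
```

Each step repeats the same pattern: call drop1(), remove at most one term
with update(), and call drop1() again on the new fit.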
O__ ---- Peter Dalgaard Øster Farimagsgade 5, Entr.B
c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
(*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk) FAX: (+45) 35327907