[R] model simplification using Crawley as a guide

Frank E Harrell Jr f.harrell at vanderbilt.edu
Wed Jun 11 13:42:10 CEST 2008


ChCh wrote:
> Hello,
> 
> I have consciously avoided using step() for model simplification in favour
> of manually updating the model by removing non-significant terms one at a
> time.  I'm using The R Book by M.J. Crawley as a guide. It comes as no
> surprise that my analysis does not proceed as smoothly as does Crawley's and
> being a beginner, I'm struggling with what to do next.  
> 
> I have a model:
> 
> lm(y~A * B * C)
> 
> where A is a categorical variable with three levels and B and C are
> continuous covariates.
> 
> Following Crawley, I execute the model, then use summary.aov() to identify
> non-significant terms.  I begin deleting non-significant interaction terms
> one at a time (using update).  After each update() statement, I use
> anova(modelOld,modelNew) to contrast the previous model with the updated
> one.  After removing all the interaction terms, I'm left with:
> 
> lm(y~ A + B + C)
> 
> again, using summary.aov() I identify A to be non-significant, so I remove
> it, leaving:
> 
> lm(y~B + C)
> 
> where both B and C are continuous variables.
> 
> Does it still make sense to use summary.aov() or should I use summary.lm()
> instead?  Has the analysis switched from an ANCOVA to a regression?  The two
> give different results, so I'm uncertain which summary to accept.
> 
> Any help would be appreciated!
> 
> 
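
On the narrow question of why the two summaries disagree: the fitted model 
is the same either way, but summary.aov() reports sequential F tests (each 
term adjusted only for the terms listed before it), while summary.lm() 
reports t tests for each coefficient adjusted for every other term.  A 
rough sketch on simulated data follows; the data, the seed, and the object 
names (fit, seq_tab, t_sq, marg) are made up for illustration and are not 
from the original post or from Crawley.

```r
# Simulated stand-in for the poster's data; B and C are deliberately correlated.
set.seed(1)
n <- 60
B <- rnorm(n)
C <- 0.7 * B + rnorm(n)
y <- 1 + 0.5 * B + 0.5 * C + rnorm(n)

fit <- lm(y ~ B + C)

summary.aov(fit)   # sequential F tests: B ignoring C, then C after B
summary.lm(fit)    # t tests: each coefficient adjusted for the other

# The same comparison as numbers rather than printed tables.
seq_tab <- anova(fit)                        # same sequential tests as summary.aov()
t_sq    <- coef(summary(fit))[, "t value"]^2 # squared t statistics
marg    <- drop1(fit, test = "F")            # marginal F test for each term

# For 1-df terms the marginal F equals the squared t from summary.lm().
all.equal(t_sq[["B"]], marg["B", "F value"])

# Only the last term in the formula has the same test in both summaries.
all.equal(seq_tab["C", "F value"], t_sq[["C"]])

# Reordering the terms changes the sequential table, not the coefficients.
summary.aov(lm(y ~ C + B))
```

With uncorrelated (balanced, orthogonal) predictors the two tables would 
agree; with observational covariates they generally will not, and neither 
labels the model as "ANCOVA" or "regression" in any way that changes the fit.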

What is the theoretical basis for removing insignificant terms?  How 
will you compensate for this in the final analysis (e.g., how do you 
unbias your estimate of sigma squared)?
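
To make the sigma squared concern concrete, here is a rough simulation 
sketch.  It assumes pure-noise data (no predictor has any real effect, true 
residual variance 1) and a simple p-value-based backward elimination; the 
helper backward_eliminate(), the 0.05 cutoff, and the sample sizes are all 
hypothetical choices for illustration, not anyone's recommended procedure.

```r
# Hypothetical helper: drop the least significant term until all remaining
# terms have p < alpha (or nothing is left). Expects a response column "y".
backward_eliminate <- function(dat, alpha = 0.05) {
  keep <- setdiff(names(dat), "y")
  repeat {
    rhs <- if (length(keep) > 0) keep else "1"
    fit <- lm(reformulate(rhs, response = "y"), data = dat)
    if (length(keep) == 0) return(fit)      # intercept-only model
    tab <- drop1(fit, test = "F")           # marginal F test per term
    pv  <- tab[keep, "Pr(>F)"]
    if (max(pv) < alpha) return(fit)        # everything left is "significant"
    keep <- keep[-which.max(pv)]            # remove the weakest term, refit
  }
}

set.seed(42)
n <- 30      # observations per data set
p <- 10      # candidate predictors, all unrelated to y
nsim <- 200  # simulated data sets

s2_full <- numeric(nsim)  # residual variance estimate, full model
s2_sel  <- numeric(nsim)  # residual variance estimate, after elimination

for (i in seq_len(nsim)) {
  # y is N(0, 1) noise, so the true sigma squared is 1 by construction.
  dat <- data.frame(y = rnorm(n), matrix(rnorm(n * p), n, p))
  s2_full[i] <- summary(lm(y ~ ., data = dat))$sigma^2
  s2_sel[i]  <- summary(backward_eliminate(dat))$sigma^2
}

# Compare the average estimates with the true value of 1.
c(full = mean(s2_full), selected = mean(s2_sel))
```

The full-model estimate is unbiased for the true variance.  The estimate 
from the selected model tends to run low, because terms survive only when 
they happened to absorb an unusually large share of the noise, and the 
final summary takes no account of the terms that were tried and discarded.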

-- 
Frank E Harrell Jr   Professor and Chair           School of Medicine
                      Department of Biostatistics   Vanderbilt University


