[R] general question about dropping terms of glm model fits

Frank Harrell f.harrell at vanderbilt.edu
Fri Mar 18 23:37:18 CET 2011


It will distort statistical inference to drop any terms on the basis of
P-values, AIC, etc.

If you do drop terms, follow the hierarchy principle: remove higher-order
interactions before any of the lower-order terms they contain.
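
A minimal sketch of what the hierarchy principle means for a model of the
form asked about below, assuming a plain Poisson glm() instead of lmer() and
simulated data (the data frame d, the simulated coefficients, and the object
names are made up for illustration). It only shows which term may be removed
first; it is not a recommendation to select terms by P-value.

```r
# Simulated stand-in data; x, y, z play the role of the abiotic covariates.
set.seed(1)
n <- 200
d <- data.frame(x = rnorm(n), y = rnorm(n), z = rnorm(n))
d$abundance <- rpois(n, lambda = exp(0.5 + 0.3 * d$x - 0.2 * d$y +
                                     0.1 * d$x * d$y))

# Full model: main effects, all two-way interactions, and the three-way term.
full <- glm(abundance ~ (x + y + z)^3, family = poisson, data = d)

# Hierarchical step: the only term that can go first is the highest-order one.
no3way <- update(full, . ~ . - x:y:z)

# All main effects and two-way interactions are still present.
labels_reduced <- attr(terms(no3way), "term.labels")
print(labels_reduced)

# Likelihood ratio test of the removed term (reduced vs. full model).
lrt <- anova(no3way, full, test = "Chisq")
print(lrt)
```

A model such as abundance ~ x + x:y, which keeps x:y but omits y, would
violate the principle, because the interaction is then no longer interpretable
as a modification of the main effects.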

High correlations between variables don't necessarily invalidate a test.
Frank


Sacha Viquerat-2 wrote:
> 
> Hello dear list!
> While helping someone with the statistical analysis of a count survey,
> I stumbled upon a very basic question about model optimization:
> 
> when fitting a model like:
> 
> model<-lmer(abundance~(x+y+z)^3,family=poisson,...)
> 
> in which x, y, z are continuous abiotic parameters such as po4
> concentration or no2 concentration, which terms / interaction terms would
> you recommend removing FIRST?
> 
> the ones of lowest significance (i.e. the ones with highest p-value) OR
> 
> the ones with the most complex interaction structure (even though 
> p-values may be low-ish)?
> 
> another question just popped into my mind:
> 
> let's say I've reduced my model to significant terms:
> 
> y ~ temperature + po4 + po4:temperature
> 
> and I know that the correlation between po4 and temperature is high. Would
> you say that this is reason enough to remove the interaction term?
> 
> any opinion is a welcome opinion!
> 


-----
Frank Harrell
Department of Biostatistics, Vanderbilt University


