[R] type III effect from glm()

Thu Feb 19 16:25:32 CET 2009

Dear Mark and Simon,

I assume from the variable names that siteall and district are factors and
that yrs is numeric. If that's the case, then the second model formula, ~
siteall + district + yrs:district, nests yrs within district, that is, will
fit a separate slope for years within each level of district -- what you'd
get by ~ siteall + district/yrs or ~ siteall + district + yrs %in%
district. This model is equivalent to ~ siteall + yrs*district, although
it's parametrized differently. To see what's happening, check
model.matrix(test1) and model.matrix(test2).
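
To make the equivalence concrete, here is a minimal sketch on simulated data. The data frame d and everything in it are made up for illustration (they are not Simon's data), and siteall and the weights are left out to keep it short. The two formulas produce model matrices with the same number of columns and the same fit, which is why anova() reports 0 df and 0 deviance.

```r
# Hypothetical stand-in data: 3 districts, 40 years each (not the original data)
set.seed(1)
d <- data.frame(
  district = factor(rep(c("A", "B", "C"), each = 40)),
  yrs      = rep(1:40, times = 3)
)
d$count <- rpois(nrow(d), lambda = exp(1 + 0.02 * d$yrs))

# Crossed parametrization: common slope plus slope differences from district A
test1 <- glm(count ~ yrs * district, family = quasipoisson, data = d)

# Nested parametrization: a separate slope for yrs within each district
test2 <- glm(count ~ district + yrs:district, family = quasipoisson, data = d)

# Different column names, same number of columns spanning the same space
colnames(model.matrix(test1))
colnames(model.matrix(test2))

# Same fit, so the comparison has nothing left to test
all.equal(deviance(test1), deviance(test2))
all.equal(unname(fitted(test1)), unname(fitted(test2)))
anova(test1, test2, test = "F")
```

Because the two models are the same model, no comparison between them can test the "main effect" of yrs.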

More generally, R avoids violating marginality. If you want "type-III"
tests, you could use the Anova() function in the car package. But if I've
interpreted the meaning of the predictors properly, the "type-III" test for
the "main effect" of yrs is simply the test that the slope for yrs is 0 in
the first (reference) category of district, assuming that you're using the
default dummy-coded (contr.treatment) contrasts -- not generally a
particularly interesting hypothesis.
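
As a rough illustration of that last point, again on simulated data (the data frame d, the district labels, and the object names are all hypothetical): under contr.treatment the yrs coefficient in the crossed model matches the within-district slope for the reference district, and it changes when the reference level changes. The Anova() call at the end runs only if the car package happens to be installed.

```r
# Hypothetical stand-in data (not the original data)
set.seed(1)
d <- data.frame(
  district = factor(rep(c("A", "B", "C"), each = 40)),
  yrs      = rep(1:40, times = 3)
)
d$count <- rpois(nrow(d), lambda = exp(1 + 0.02 * d$yrs))

# Crossed model with default treatment contrasts; reference district is "A"
fit_A <- glm(count ~ yrs * district, family = quasipoisson, data = d)

# Nested model: one yrs slope per district, in the order A, B, C
fit_nested <- glm(count ~ district/yrs, family = quasipoisson, data = d)
slopes <- coef(fit_nested)[grep(":yrs$", names(coef(fit_nested)))]

# The yrs "main effect" is just the slope within district A
unname(coef(fit_A)["yrs"])
unname(slopes[1])

# Re-level so that "C" is the reference: the yrs coefficient now
# estimates the slope within district C instead
d$district_C <- relevel(d$district, ref = "C")
fit_C <- glm(count ~ yrs * district_C, family = quasipoisson, data = d)
unname(coef(fit_C)["yrs"])
unname(slopes[3])

# Optional: "type-III" F tests, if car is available
if (requireNamespace("car", quietly = TRUE)) {
  print(car::Anova(fit_A, type = "III", test.statistic = "F"))
}
```

So the "type-III" test for yrs depends on which district is the baseline, which is an arbitrary choice.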

Regards,
 John

------------------------------
John Fox, Professor
Department of Sociology
McMaster University
Hamilton, Ontario, Canada
web: socserv.mcmaster.ca/jfox

> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On
> Behalf Of markleeds at verizon.net
> Sent: February-19-09 6:23 AM
> To: Simon Pickett; r-help at r-project.org
> Subject: Re: [R] type III effect from glm()
> 
>   Hi Simon: Below, test1 spelled out is count ~ siteall + yrs +
> district + yrs:district, so this is fine.
> 
> But in test2, you have yrs interacting with district but not the main
> effect for yrs. This is against the rules of marginality, so I still
> think there's a problem. I would wait for John or the other wizaRds to
> respond (you know who you are) because I don't feel particularly
> confident giving advice on this; I bang my head against it often
> too. Plus, I gotta go home because it's getting light out soon (I'm in
> the US on the east coast). Good luck.
> 
> 
> 
> 
> On Thu, Feb 19, 2009 at  6:10 AM, Simon Pickett wrote:
> 
> > Cheers Mark,
> >
> > I did originally think that too, i.e. that not including the main
> > effect was the problem. However, the same thing happens when I include
> > main effects....
> >
> >
> > test1<-glm(count~siteall+yrs*district,family=quasipoisson,weights=weight,data=m[x[[i]],])
> >
> > test2<-glm(count~siteall+district+yrs:district,family=quasipoisson,weights=weight,data=m[x[[i]],])
> > anova(test1,test2,test="F")
> >
> > Model 1: count ~ siteall + yrs * district
> > Model 2: count ~ siteall + district + yrs:district
> >  Resid. Df Resid. Dev   Df Deviance F Pr(>F)
> > 1      1933      75665
> > 2      1933      75665    0        0
> >
> > Simon.
> >
> >
> >
> >
> > ----- Original Message ----- From: <markleeds at verizon.net>
> > To: "Simon Pickett" <simon.pickett at bto.org>
> > Sent: Thursday, February 19, 2009 10:50 AM
> > Subject: RE: [R] type III effect from glm()
> >
> >
> >>  Hi Simon: John Fox can say a lot more about the below, but I've been
> >> reading his book over and over recently, and one thing he constantly
> >> stresses is marginality, which he defines as always including the
> >> lower-order term if you include it in a higher-order term. So I
> >> think the below is problematic because you are including an
> >> interaction that involves yrs without including the main effect of
> >> the other variable in it. This definitely causes problems when trying
> >> to interpret the anova table or the Anova table. That's as much as I
> >> can say. I highly recommend his text for this sort of thing, and
> >> hopefully he will respond.
> >>
> >> Oh, my point is that if you want to check the effect of yrs, then I
> >> think you have to take it out of model 2 totally in order to
> >> interpret the anova ( or the Anova ) table.
> >>
> >> On Thu, Feb 19, 2009 at  5:38 AM, Simon Pickett wrote:
> >>
> >>> Hi all,
> >>>
> >>> This could be naivety/stupidity on my part rather than a problem
> >>> with model output, but here goes....
> >>>
> >>> I have fitted a fairly simple model
> >>>
> >>>
> >>> m1<-glm(count~siteall+yrs+yrs:district,family=quasipoisson,weights=weight,data=m[x[[i]],])
> >>>
> >>> I want to know if yrs (a continuous variable) has a significant
> >>> unique effect in the model, so I fit a simplified model with the
> >>> main effect omitted...
> >>>
> >>>
> >>>
> >>> m2<-glm(count~siteall+yrs:district,family=quasipoisson,weights=weight,data=m[x[[i]],])
> >>>
> >>> then compare models using anova()
> >>> anova(m1,m2,test="F")
> >>>
> >>> Analysis of Deviance Table
> >>>
> >>> Model 1: count ~ siteall + yrs + yrs:district
> >>> Model 2: count ~ siteall + yrs:district
> >>>   Resid. Df Resid. Dev   Df Deviance F Pr(>F)
> >>> 1      1936      75913
> >>> 2      1936      75913    0        0
> >>>>
> >>>
> >>> The d.f.'s are exactly the same, is this right? Can I only test the
> >>> significance of a main effect when it is not in an interaction?
> >>> Thanks in advance,
> >>>
> >>> Simon.
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> Dr. Simon Pickett
> >>> Research Ecologist
> >>> Land Use Department
> >>> Terrestrial Unit
> >>> British Trust for Ornithology
> >>> The Nunnery
> >>> Thetford
> >>> Norfolk
> >>> IP242PU
> >>> 01842750050
> >>>
> >>>
> >>> ______________________________________________
> >>> R-help at r-project.org mailing list
> >>> https://stat.ethz.ch/mailman/listinfo/r-help
> >>> PLEASE do read the posting guide
> >>> http://www.R-project.org/posting-guide.html
> >>> and provide commented, minimal, self-contained, reproducible code.
> >>
> 