[Rd] glm offset and interaction bugs (PR#4941)
Thomas Lumley
tlumley at u.washington.edu
Tue Nov 4 04:39:26 MET 2003
On Tue, 4 Nov 2003 charlie at stat.umn.edu wrote:
> Full_Name: Charles J. Geyer
> Version: 1.8.0
> OS: i686-pc-linux-gnu (Suse 8.2)
> Submission from: (NULL) (134.84.86.22)
>
>
> Two bugs (perhaps related, perhaps independent) revealed by the same
> Poisson regression with offset
Only one bug, i think. There is certainly a bug in handling offset() in
formulas with interactions (which is presumably fairly recent, since the C
code for offsets dates only from January this year).
The latter two models, though, are the same AFAICS, or more precisely are
reparametrisations of the same underidentified model. The only way to make
+ commutative in this example would be to have some arbitrary rule based
on the name of the variable for which parameters to drop, and that would
have the even more unintuitive effect that changing the name of a variable
would affect the output.
> mydata <- read.table(url("http://www.stat.umn.edu/geyer/5931/mle/seeds.txt"))
> out.fubar <- glm(seedlings ~ burn01 + vegtype * burn02 +
> offset(log(totalseeds)), data = mydata, family = poisson)
> summary(out.fubar)
> out.barfu <- glm(seedlings ~ burn01 + vegtype * burn02,
> offset = log(totalseeds), data = mydata, family = poisson)
> summary(out.barfu)
> out.ok <- glm(seedlings ~ vegtype * burn02 + burn01,
> offset = log(totalseeds), data = mydata, family = poisson)
> summary(out.ok)
>
> As far as I can tell from reading the documentation, these should produce
> the same results. They don't. The regression coefficient for the
> offset term in the first (fubar) regression is bogus. That's not what
> offset() is supposed to do. Note that offset() works properly in
>
> out <- glm(seedlings ~ vegtype + burn01 + burn02 + offset(log(totalseeds)),
> data = mydata, family = poisson)
> summary(out)
>
> So is is only partially bogus -- very dangerous for users that are less
> than hyperalert.
>
> The difference between out.barfu and out.ok shows that "+" in formulas
> is noncommutative, which is very mind bending.
>
> The regression in out.ok is o. k. It checks by hand.
>
> For a more complete explanation (if more is wanted), including
> the printout from these summary commands on my machine and the
> check of out.ok "by hand", see
>
> http://www.stat.umn.edu/geyer/5931/mle/seed2.Rnw
> http://www.stat.umn.edu/geyer/5931/mle/seed2.pdf
>
> ______________________________________________
> R-devel at stat.math.ethz.ch mailing list
> https://www.stat.math.ethz.ch/mailman/listinfo/r-devel
>
Thomas Lumley Assoc. Professor, Biostatistics
tlumley at u.washington.edu University of Washington, Seattle
More information about the R-devel
mailing list