[Rd] glm offset and interaction bugs (PR#4941)

charlie at stat.umn.edu
Tue Nov 4 02:14:00 MET 2003


Full_Name: Charles J. Geyer
Version: 1.8.0
OS: i686-pc-linux-gnu (Suse 8.2)
Submission from: (NULL) (134.84.86.22)


Two bugs (perhaps related, perhaps independent) are revealed by the same
Poisson regression with offset:

mydata <- read.table(url("http://www.stat.umn.edu/geyer/5931/mle/seeds.txt"))
out.fubar <- glm(seedlings ~ burn01 + vegtype * burn02 +
    offset(log(totalseeds)), data = mydata, family = poisson)
summary(out.fubar)
out.barfu <- glm(seedlings ~ burn01 + vegtype * burn02,
    offset = log(totalseeds), data = mydata, family = poisson)
summary(out.barfu)
out.ok <- glm(seedlings ~ vegtype * burn02 + burn01,
    offset = log(totalseeds), data = mydata, family = poisson)
summary(out.ok)

As far as I can tell from reading the documentation, these should produce
the same results.  They don't.  The regression coefficient for the
offset term in the first (fubar) regression is bogus.  That's not what
offset() is supposed to do.  Note that offset() works properly in

out <- glm(seedlings ~ vegtype + burn01 + burn02 + offset(log(totalseeds)),
    data = mydata, family = poisson)
summary(out)

So it is only partially bogus -- very dangerous for users who are less
than hyperalert.

The difference between out.barfu and out.ok shows that "+" in formulas
is noncommutative, which is very mind bending.
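
For readers without access to seeds.txt, the comparison can be sketched on
simulated data.  Everything about the data frame below is an assumption made
for illustration (the name sim, three vegtype levels, burn01 and burn02 coded
0/1, the true coefficients); only the three model formulas come from the
report.  On a version of R where the offset and term order are handled
correctly, one would expect all three fits to give the same coefficients and
no coefficient to be reported for the offset.

```r
# Simulated stand-in for the seeds data (hypothetical, not the real file).
set.seed(42)
n <- 200
sim <- data.frame(
    vegtype    = factor(sample(c("A", "B", "C"), n, replace = TRUE)),
    burn01     = rbinom(n, 1, 0.5),
    burn02     = rbinom(n, 1, 0.5),
    totalseeds = sample(50:500, n, replace = TRUE)
)
eta <- -3 + 0.5 * sim$burn01 + 0.3 * (sim$vegtype == "B") -
    0.4 * sim$burn02 + log(sim$totalseeds)
sim$seedlings <- rpois(n, exp(eta))

# The three specifications from the report, fitted to the simulated data.
fit.formula <- glm(seedlings ~ burn01 + vegtype * burn02 +
    offset(log(totalseeds)), data = sim, family = poisson)
fit.arg <- glm(seedlings ~ burn01 + vegtype * burn02,
    offset = log(totalseeds), data = sim, family = poisson)
fit.reordered <- glm(seedlings ~ vegtype * burn02 + burn01,
    offset = log(totalseeds), data = sim, family = poisson)

# Compare coefficients by name, since the term order differs between fits.
b <- coef(fit.formula)
same.arg <- isTRUE(all.equal(b, coef(fit.arg)[names(b)], tolerance = 1e-6))
same.reordered <- isTRUE(all.equal(b, coef(fit.reordered)[names(b)],
    tolerance = 1e-6))
# An offset has its coefficient fixed at 1, so it should not appear here.
has.offset.coef <- any(grepl("offset", names(b)))
print(c(same.arg = same.arg, same.reordered = same.reordered,
    has.offset.coef = has.offset.coef))
```

Any FALSE in the first two entries, or TRUE in the third, would reproduce the
kind of disagreement described above.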

The regression in out.ok is OK.  It checks by hand.
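
One possible by-hand check, not necessarily the one used in the seed2
document linked below, relies on two standard facts about a Poisson
regression with log link: the fitted means are exp(X beta + offset), and at
the maximum likelihood estimate the score X'(y - mu) is zero.  The sketch
below applies both to simulated data; the data frame sim and its true
coefficients are assumptions made for illustration.

```r
# Simulated stand-in for the seeds data (hypothetical, not the real file).
set.seed(1)
n <- 200
sim <- data.frame(
    vegtype    = factor(sample(c("A", "B", "C"), n, replace = TRUE)),
    burn01     = rbinom(n, 1, 0.5),
    burn02     = rbinom(n, 1, 0.5),
    totalseeds = sample(50:500, n, replace = TRUE)
)
sim$seedlings <- rpois(n,
    exp(-3 + 0.5 * sim$burn01 - 0.4 * sim$burn02) * sim$totalseeds)

# Same model as out.ok, with a tight convergence tolerance for the check.
fit <- glm(seedlings ~ vegtype * burn02 + burn01,
    offset = log(totalseeds), data = sim, family = poisson,
    control = glm.control(epsilon = 1e-12))

X <- model.matrix(fit)   # design matrix; the offset is not a column of X
mu <- fitted(fit)        # fitted means, which include the offset

# Check 1: rebuild the means from the coefficients plus the offset.
mu.hand <- as.vector(exp(X %*% coef(fit) + log(sim$totalseeds)))
max.diff <- max(abs(mu.hand - mu))

# Check 2: the score X'(y - mu) should vanish at the MLE.
score <- crossprod(X, sim$seedlings - mu)
rel.score <- max(abs(score)) / sum(sim$seedlings)
print(c(max.diff = max.diff, rel.score = rel.score))
```

Both numbers should be zero up to rounding error when the offset enters the
linear predictor with its coefficient fixed at 1.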

For a more complete explanation (if more is wanted), including
the printout from these summary commands on my machine and the
check of out.ok "by hand", see

   http://www.stat.umn.edu/geyer/5931/mle/seed2.Rnw
   http://www.stat.umn.edu/geyer/5931/mle/seed2.pdf


