[Rd] Inconsistency, possibly a bug? (PR#758)

Brett Presnell presnell@stat.ufl.edu
Wed, 6 Dec 2000 18:56:56 -0500 (EST)


In message <Pine.GSO.4.05.10012050747500.21711-100000@auk.stats> you write:
> On Tue, 5 Dec 2000 presnell@stat.ufl.edu wrote:
> > 
> > This happens because of the line
> > 
> >   mf$drop.unused.levels <- TRUE
> > 
> > in lm and the lack of same in glm.  I'm reporting this as a bug, but
> > perhaps the difference is intentional?
> 
> No, as the glm code is not protected from user error as the lm code is. I
> am not sure it is bug (it's not documented to happen) and probably there
> are several related occurrences in other modelling functions, but it would
> be a desirable addition and I will change it for glm (at least).

Seeing the same topic come up on the Snews list prompted me to revisit
this. I will preface this by admitting that it is not a big issue one
way or the other, since the "problem" is easy enough to work around
(although the poster on the Snews list apparently felt otherwise).

It is handy to fit a model to a data subset selected by the levels of
a factor z, and then to use the fit for prediction with new data which
of course does not include levels of z other than those used in the
fit (that's exactly what I was doing when I noticed the inconsistency
between lm and glm).

However, this is not always quite as easy as it might otherwise be
because:

1. lm (and now glm) removes the unused levels from the factor, and

2. predict.lm and predict.glm (or more precisely model.frame.default)
   check levels() for factors in newdata against those used in the
   fit, as opposed to checking the levels ACTUALLY USED in the
   newdata.
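
To make the distinction concrete, here is a minimal sketch with a
made-up factor z (the names and values are purely illustrative, not
taken from the original report). Subsetting a factor keeps its full
levels attribute, so levels() can list values that no observation
actually uses:

```r
# Hypothetical factor with three levels.
z <- factor(c("a", "b", "c", "a", "b"))

# Keep only the observations with z equal to "a" or "b".
zsub <- z[z != "c"]

# levels() still reports all three levels ...
declared <- levels(zsub)            # "a" "b" "c"

# ... whereas only two of them are actually in use.
in.use <- levels(factor(zsub))      # "a" "b"

print(declared)
print(in.use)
```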

Of course it is not difficult to remove the unused levels from the
factors in newdata, but it would be convenient if predict would do
this automatically, or at least if there were an option to do so.
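
For reference, the manual workaround can be sketched as follows. The
data frame d, the variables x, y and z, and the choice of subset are
all hypothetical, and this is only one way of dropping the unused
levels before calling predict:

```r
# Hypothetical data: response y, covariate x, factor z with three levels.
set.seed(1)
d <- data.frame(x = rnorm(30),
                z = factor(rep(c("a", "b", "c"), each = 10)))
d$y <- 1 + 2 * d$x + as.numeric(d$z) + rnorm(30)

# Fit to the subset with z in {"a", "b"}.
fit <- glm(y ~ x + z, data = d, subset = z %in% c("a", "b"))

# New data drawn from the same subset: levels(nd$z) still lists "c",
# although no row of nd uses it.
nd <- d[d$z %in% c("a", "b"), ]

# Workaround: rebuild the factor so that only the levels in use remain.
nd$z <- factor(nd$z)

p <- predict(fit, newdata = nd)
```

Calling factor() on an existing factor is the simplest way to discard
levels that are not in use; nd$z[, drop = TRUE] should have the same
effect.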

So, am I missing something obvious, or would this be a good thing to
do?

-- 
Brett Presnell
Department of Statistics
University of Florida