[Rd] Mismatches in predict(newdata)

Prof Brian Ripley ripley at stats.ox.ac.uk
Tue Nov 11 09:38:39 MET 2003


One of the recent reports was of predict.lm misbehaving when newdata 
contained a logical column (as in newdata=data.frame(x=rep(NA, 10))) for 
a variable that had been fitted as a numeric one.

The immediate cause was that model.matrix was trying to handle a 0-level 
factor (which is what that logical column got converted to by 
contrasts<-).  However, the problem is more general, and I have added a 
layer of protection to R-devel.

When model.frame is called, it adds to the terms attribute of its result 
an attribute "dataClasses", and this can be checked against the model 
frame built from the newdata argument by a call to .checkMFClasses:  see 
lm and predict.lm for how to do so.
Developers who use predict(newdata) may wish to add such code to their 
packages.  (You can use

        if (!is.null(cl <- attr(Terms, "dataClasses")) &&
            exists(".checkMFClasses", envir=NULL)) 
            .checkMFClasses(cl, m)

to be backwards compatible.)
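
As a rough illustration of the check described above, here is a minimal
sketch that reproduces the reported situation by hand.  The data and the
names d, fit, bad and res are made up for the example.  It assumes a
current R, where .checkMFClasses is exported from the stats package and
where envir=NULL is no longer accepted by exists(), so the guard is
written without it.

```r
# Fit with a numeric predictor (toy data, invented for this sketch).
set.seed(1)
d <- data.frame(x = 1:10, y = rnorm(10))
fit <- lm(y ~ x, data = d)

# The classes recorded at fitting time travel with the terms object.
Terms <- delete.response(terms(fit))
cl <- attr(Terms, "dataClasses")

# newdata supplies a logical column where a numeric one was fitted.
bad <- data.frame(x = rep(NA, 10))
m <- model.frame(Terms, bad, na.action = na.pass)

# Same guard as in the snippet above, minus envir=NULL (an assumption
# about current R); the error is caught so the message can be inspected.
res <- tryCatch({
    if (!is.null(cl) && exists(".checkMFClasses"))
        .checkMFClasses(cl, m)
    "no mismatch"
}, error = function(e) conditionMessage(e))
print(res)
```

With the check in place the mismatch should surface as an error naming
the offending variable, rather than as a failure further down in
model.matrix.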

The exact nature of the `classes' is tricky because of inheritance. I have
implemented logical, ordered, factor (not ordered), numeric (not matrix),
nmatrix.n and other: nmatrix.n is a numeric matrix of n columns (as used
by poly() and bs(), for example).  Let me know if you see a need for other 
categories.
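
To see which category a given variable falls into, one can inspect the
"dataClasses" attribute directly.  The following is only a sketch on
invented data (the data frame d and its column names are hypothetical),
assuming a current R with the stats package attached.

```r
# Toy data with one column per category of interest (all names invented).
set.seed(2)
d <- data.frame(
    y    = rnorm(12),
    flag = rep(c(TRUE, FALSE), 6),
    grp  = factor(rep(c("a", "b", "c"), 4)),
    dose = factor(rep(c("lo", "mid", "hi"), 4),
                  levels = c("lo", "mid", "hi"), ordered = TRUE),
    x    = seq_len(12) / 2
)

# poly(x, 2) yields a two-column numeric matrix in the model frame.
mf <- model.frame(y ~ flag + grp + dose + x + poly(x, 2), data = d)

# Named character vector: one entry per variable in the model frame.
cls <- attr(terms(mf), "dataClasses")
print(cls)
```

Here flag should be reported as logical, grp as factor, dose as ordered,
x as numeric, and the poly(x, 2) term as nmatrix.2.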

Brian

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


