[Rd] Issue with prediction from lm object with poly

Ulrike Grömping groemping at bht-berlin.de
Tue Aug 3 21:24:47 CEST 2010


DDear developeRs,

about a year ago, Alex Stolpovsky posted an issue with predict.lm on a 
fit generated using poly with the raw=TRUE option and too few new data 
(slightly modified reproducible example below). Alex did not get any 
reply. I have just stumbled on the same problem, and I think that this 
is a bug of function poly, which arises from the check whether the 
polynomial degree is compatible with the number of unique data points.

Unfortunately, poly is the only comfortable way of automatically 
guaranteeing compatibility between values for different degree 
polynomials in newdata. With the I(x^2) solution for the square, 
compatibility must be manually ascertained. I got stuck with this for 
Russ Lenth's implementation of contour.lm in package rsm, where 
incompatible linear and squared values are fixed, when using standard 
quadratic terms (like x + I(x^2)) in the formula, because I(x^2) does 
not remember it's relation to x. I tried to fix this by using 
poly(x,2,raw=TRUE) instead of x + I(x^2), which failed because of the 
issue raised by Alex Stolpovsky. I think it would be desirable to make 
prediction with poly and the raw=TRUE option work. In my opinion, this 
is more important than checking for admissibility of the degree of the 
polynomial.

A related question: wouldn't it be more logical, if model.matrix would 
return the complete model matrix with all polynomial degrees, but 
model.frame would only return the original data used to generate the 
poly matrix? This might be a change with massive consequences and 
therefore undesirable, but ...

Best, Ulrike

******* the example that shows the issue
x <- c(1:10)
y <- sin(c(1:10))
fit <- lm(formula=y~poly(x, 5, raw=TRUE))
predict(fit, newdata=data.frame(x=c(1:10))) ## this works
predict(fit, newdata=data.frame(x=1))  ## this is broken, error below

Error in poly(x, 5, raw = TRUE) :
  'degree' must be less than number of unique points

The problem is in poly():

    if (raw) {
        if (degree >= length(unique(x)))
            stop("'degree' must be less than number of unique points")



More information about the R-devel mailing list