[R] predict.lm if regression vector is longer than predicton vector

William Dunlap wdunlap at tibco.com
Wed Oct 3 17:47:37 CEST 2012


This can happen if your newdata data.frame does not include
all the predictors required by the formula in the model.  In that
case predict will look in the environment of the model formula
(usually the workspace where you fitted the model) to find the
missing predictors, and those will generally not match what is
in your newdata.  E.g.,

> x1 <- 1:6
> x2 <- 1/(1:6)
> y <- log(1:6)
> fit <- lm(y ~ x1 + x2)
> predict(fit)
           1            2            3            4            5            6 
-0.008176128  0.725397589  1.089747865  1.361792281  1.596914353  1.813575253 
> predict(fit, newdata=data.frame(x2=1:5)) # didn't supply x1
Error in model.frame.default(Terms, newdata, na.action = na.action, xlev = object$xlevels) : 
  variable lengths differ (found for 'x2')
In addition: Warning message:
'newdata' had 5 rows but variable(s) found have 6 rows
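
The length mismatch is what makes the problem visible here. As a minimal
sketch, reusing the toy variables from this example: if newdata happens to
have as many rows as the variables left in the workspace, predict gives no
error or warning and quietly uses the workspace copy of x1.

```r
# Same toy data and model as in the transcript above.
x1 <- 1:6
x2 <- 1/(1:6)
y <- log(1:6)
fit <- lm(y ~ x1 + x2)

# newdata omits x1 but has 6 rows, the same length as the workspace x1,
# so nothing signals that x1 was taken from outside newdata.
silent <- predict(fit, newdata = data.frame(x2 = 1:6))

# Supplying the workspace x1 explicitly gives the same predictions,
# which shows where the missing predictor came from.
explicit <- predict(fit, newdata = data.frame(x1 = x1, x2 = 1:6))
print(all.equal(silent, explicit))
```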

Put all the required variables into newdata and things are fine:
> predict(fit, newdata=data.frame(x2=1:5, x1=sin(1:5)))
         1          2          3          4          5 
-0.0366699 -1.1321492 -2.3778906 -3.6469522 -4.7909516
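
One way to catch this before calling predict is to compare the variable
names in the model formula with the names in newdata. The following is only
a sketch: missing_predictors is a hypothetical helper, not part of R, and it
assumes every name on the right-hand side of the formula is a predictor
that should come from newdata (it would also report constants used inside
the formula, such as a degree variable passed to poly()).

```r
# Same toy data and model as in the transcript above.
x1 <- 1:6
x2 <- 1/(1:6)
y <- log(1:6)
fit <- lm(y ~ x1 + x2)

# Hypothetical helper: names used on the right-hand side of the model
# formula that are absent from newdata.
missing_predictors <- function(model, newdata) {
  needed <- all.vars(delete.response(terms(model)))
  setdiff(needed, names(newdata))
}

# x1 is not supplied, so it is reported as missing.
print(missing_predictors(fit, data.frame(x2 = 1:5)))

# Both predictors supplied: the result is an empty character vector.
print(missing_predictors(fit, data.frame(x2 = 1:5, x1 = sin(1:5))))
```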

You can also get this problem if newdata is an environment or list
instead of a data.frame, because only data.frame forces all of
its components to have the same length.
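
A small sketch of the list case, again with the toy model from above: a
plain list accepts components of different lengths, so the mismatch only
shows up once predict builds the model frame, whereas data.frame rejects
the same input at construction time.

```r
# Same toy data and model as in the transcript above.
x1 <- 1:6
x2 <- 1/(1:6)
y <- log(1:6)
fit <- lm(y ~ x1 + x2)

# A plain list does not check that its components have equal length.
bad <- list(x1 = 1:5, x2 = 1:4)

# The error is raised only inside predict(); capture its message.
res <- tryCatch(predict(fit, newdata = bad),
                error = function(e) conditionMessage(e))
print(res)

# data.frame() refuses the same components when it is constructed.
df_err <- tryCatch(data.frame(x1 = 1:5, x2 = 1:4),
                   error = function(e) conditionMessage(e))
print(df_err)
```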


Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf
> Of frauke
> Sent: Wednesday, October 03, 2012 7:37 AM
> To: r-help at r-project.org
> Subject: [R] predict.lm if regression vector is longer than predicton vector
> 
> Hi everybody,
> 
> recently a member of the community pointed me to the useful predict.lm()
> command. While I was toying with it, I stumbled across the following
> problem.
> I do the regression with data from five years. But I want to do a prediction
> with predict.lm for only one year. Thus my dataframe for predict.lm(mod,
> newdata=dataframe) is shorter than the original vector that I did the
> regression with. It gives you the following error:
> Warning message:
> 'newdata' had 365 rows but variable(s) found have 1825 rows
> Of course I can extend the new dataframe with a few thousand NAs, but is
> there a more elegant solution?
> 
> Thank you! Frauke
> 
> 
> 



