[R] predict.lm(...,type="terms") question

Rui Barradas ruipbarradas at sapo.pt
Sat Sep 1 13:33:00 CEST 2012


Hello,

John is right, there seems to be an error in predict.lm. It can be made 
to work but if the model is fitted with lm(...,data) then it messes 
things up. Apparently predict.lm disregards the data argument and uses 
whatever it finds in the global environment. In the examples below, 
after running David's x <- rnorm(15) example (case 1 is a simplified 
version of it), case 3 shows the bug, there are 15 predictions for 4 new 
values. Then we must backtrack to case 2 and also consider it a bug.

Examples:

# ------------------- 1
# David's example simplified, only one regressor.
# And with set.seed()

set.seed(4409)
x <- rnorm(15)
y <- x + rnorm(15)

fit <- lm(y ~ x)
new <- data.frame(y = seq(-3.5, 3.5, 0.5))
predict(fit, newdata = new, type = "terms")
# no error is thrown.

# ------------------- 2
# Original post, 'area' and 'concn' are now 'y' and 'x'
# Remove 'x' and 'y' from the global environment.
rm(x, y)
dat <- data.frame(
     y = c(4875, 8172, 18065, 34555),
     x = c(25, 50, 125, 250))
new <- data.frame(y = c(8172, 10220, 11570, 24150))

model <- lm(y ~ x, data = dat)
predict(model, newdata = new, type = "terms")
# Error in eval(expr, envir, enclos) : object 'x' not found

# ------------------- 3
# Original post.
# Do NOT remove 'x' and 'y' from the global environment.
set.seed(4409)
x <- rnorm(15)
y <- x + rnorm(15)

dat <- data.frame(
     y = c(4875, 8172, 18065, 34555),
     x = c(25, 50, 125, 250))
new <- data.frame(y = c(8172, 10220, 11570, 24150))

model <- lm(y ~ x, data = dat)
predict(model, newdata = new, type = "terms")
# Warning message:
# 'newdata' had 4 rows but variable(s) found have 15 rows

#------------------- 4
# Original post, no data argument to lm.

y <- c(4875, 8172, 18065, 34555)
x <- c(25, 50, 125, 250)
new <- data.frame(y = c(8172, 10220, 11570, 24150))

model <- lm(y ~ x)
predict(model, newdata = new, type = "terms")
# no warning or error is thrown, predictions seem allright

I haven't looked at the code for predict.lm, and really don't know where 
the error is.
Bug report?

Rui Barradas

Em 01-09-2012 06:40, David Winsemius escreveu:
>
> On Aug 31, 2012, at 3:48 PM, jjthaden wrote:
>
>> On Aug 30, 2012, at 4:35 PM,  David Windemius wrote:
>>
>>>> David said my newdata data frame 'new' must have a column named 
>>>> 'area'.
>>>> It did. Nonetheless predict.lm throws an error with type = "terms" and
>>>> newdata = new. I see nothing in the predict.lm documentation that
>>>> bars this usage. Is there a bug?
>>>
>>> After correcting the error in your definition of the 'area' vector I
>>> get no error from predict.lm().
>>
>>> David.
>>
>> What error did you correct?
>
> This was the code you offered:
>
> #Ludbrook's data set S1 (except renaming
> #his 'x' as 'concn' and his 'y' as 'area')
> S1 <- data.frame(
>     area = c(2.4,2.6,6.0,6.5,8.9,),
>     concn = c(1.1,4,5,8.5,8.5))
>
> There's an extra comma in the "area" vector.
>
>>
>> The newdata data frames in my examples have been pretty simple, and have
>> defined the 'area' vector simply, for instance,
>> new <- data.frame(area = c(8172, 10220, 11570, 24150))
>> new
>> #        area
>> #    1  8172
>> #    2 10220
>> #    3 11570
>> #    4 24150
>>
>> Is something wrong with this?
>
> I don't really know  ....  this appears to be a different problem than 
> you posed earlier. If you would learn to post with context, we might 
> not be in the position of trying to read your mind.
>
>>  Were you able to make predict.lm work with newdata = new and type = 
>> "terms"?
>
> Yes. I have no trouble doing so.
>
> >  x <- rnorm(15); x2=rnorm(15)
> >  y <- x + x2 +rnorm(15)
> >  fit <- lm(y ~ x+x2)
> >  new <- data.frame(x2 = seq(-3.5, 3.5, 0.5)  )
> >  predict(fit, newdata=new,type="terms", terms="x2")
>            x2
> 1  -4.9230035
> 2  -4.1850588
> 3  -3.4471141
> 4  -2.7091694
> 5  -1.9712248
> 6  -1.2332801
> 7  -0.4953354
> 8   0.2426093
> 9   0.9805539
> 10  1.7184986
> 11  2.4564433
> 12  3.1943880
> 13  3.9323326
> 14  4.6702773
> 15  5.4082220
> attr(,"constant")
> [1] 0.4501911
>
>
>>
>> -John
>>
>
>> Sent from the (NOT) R help mailing list (and NOT) archive at Nabble.com.
>
> Hmmm ... Nabble ... definitely part of the problem.
>




More information about the R-help mailing list