[R] predict.lm(...,type="terms") question

Wed Aug 29 20:36:41 CEST 2012

On Aug 29, 2012, at 8:06 AM, John Thaden wrote:

> Could it be that my newdata object needs to include a column for the
> concn term even though I'm asking for concn to be predicted?

The new data argument MUST contain a column with the name "area". If  
it does not, then the original data is used.

> If so, what numbers would I fill it with?

predict.lm is not set up to do the task you are requesting (as Peter  
Ehlers points out).
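
For what it's worth, here is a minimal sketch of doing the inversion by
hand from the fitted coefficients, i.e. the x' = (y' - b)/m calculation
described in the original post. The object names (calib, unknowns, fit,
concn_hat) are mine, chosen for illustration; the numbers are copied from
the example data. It gives point estimates only and says nothing about
their uncertainty, so treat it as a starting point rather than a full
calibration method.

```r
# Calibration standards and unknowns, copied from the example in the thread
calib <- data.frame(area  = c(4875, 8172, 18065, 34555),
                    concn = c(25, 50, 125, 250))
unknowns <- data.frame(area = c(8172, 10220, 11570, 24150))

# Forward model: area = b + m * concn
fit <- lm(area ~ concn, data = calib)
b <- coef(fit)[["(Intercept)"]]
m <- coef(fit)[["concn"]]

# Invert the fitted line by hand: concn' = (area' - b) / m
unknowns$concn_hat <- (unknowns$area - b) / m
unknowns
```

Feeding concn_hat back through predict(fit, newdata = data.frame(concn =
unknowns$concn_hat)) should reproduce the observed areas, which is a quick
check that the algebra was done the right way round.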

> Or should my newdata object include the original data, too? Is there  
> another mailing list I can ask?

This question has a good chance of having been asked on r-help before.

Why not do a search with MarkMail (using this as the first term:  
list:org.r-project.r-help) or at the Newcastle search site? Ehlers's  
suggestion of using the term "inverse prediction" or "calibration  
curve" might be useful in that task.

You can also search the accumulated package documentation with:

install.packages("sos")
require(sos)
findFn("calibration curve")
findFn("inverse prediction")

Doing a search in StackOverflow with "[r] calibration curve" also  
brings up some items that look possibly helpful.

-- 
David.

> Thanks,
> -John
>
> On Wed, Aug 29, 2012 at 9:16 AM, John Thaden <johnthaden at gmail.com>  
> wrote:
>> I think I may be misreading the help pages, too, but misreading how?
>>
>> I agree that inverting the fitted model is simpler, but I worry that I'm
>> misusing ordinary least squares regression by treating my response, with
>> its error distribution, as a predictor with no such error. In practice,
>> with my real data that includes about six independent peak area
>> measurements per known concentration level, the diagnostic plots from
>> plot.lm(inv.model) look very strange and worrisome.
>>
>> Certainly predict.lm(..., type = "terms") must be able to do what I need.
>>
>> -John
>>
>> On Wed, Aug 29, 2012 at 6:50 AM, Rui Barradas  
>> <ruipbarradas at sapo.pt> wrote:
>>>
>>> Hello,
>>>
>>> You seem to be misreading the help pages for lm and predict.lm,
>>> argument 'terms'.
>>> A much simpler way of solving your problem should be to invert the
>>> fitted model using lm():
>>>
>>>
>>> model <- lm(area ~ concn, data)  # Your original model
>>> inv.model <- lm(concn ~ area, data = data)  # Your problem's model.
>>>
>>> # predicts from original data
>>> pred1 <- predict(inv.model)
>>> # predict from new data
>>> pred2 <- predict(inv.model, newdata = new)
>>>
>>> # Let's see it.
>>> plot(concn ~ area, data = data)
>>> abline(inv.model)
>>> points(data$area, pred1, col="blue", pch="+")
>>> points(new$area, pred2, col="red", pch=16)
>>>
>>>
>>> Also, 'data' is a really bad variable name; it's already an R function.
>>>
>>> Hope this helps,
>>>
>>> Rui Barradas
>>>
>>> Em 28-08-2012 23:30, John Thaden escreveu:
>>>>
>>>> Hello all,
>>>>
>>>> How do I actually use the output of predict.lm(..., type="terms") to
>>>> predict new term values from new response values?
>>>>
>>>> I'm a chromatographer trying to use R (2.15.1) for one of the most
>>>> common calculations in that business:
>>>>
>>>>     - Given several chromatographic peak areas measured for control
>>>>       samples containing a molecule at known (increasing) concentrations,
>>>>       first derive a linear regression model relating the known
>>>>       concentration (predictor) to the observed peak area (response)
>>>>     - Then, given peak areas from new (real) samples containing
>>>>       unknown amounts of the molecule, use the model to predict
>>>>       concentrations of the molecule in the unknowns.
>>>>
>>>> In other words, given y = mx + b, I need to solve x' = (y' - b)/m for
>>>> new data y'
>>>>
>>>> and in R, I'm trying something like this
>>>>
>>>> require(stats)
>>>> data <- data.frame(area = c(4875, 8172, 18065, 34555),
>>>>                    concn = c(25, 50, 125, 250))
>>>> new <- data.frame(area = c(8172, 10220, 11570, 24150))
>>>> model <- lm(area ~ concn, data)
>>>> pred <- predict(model, type = "terms")  # predicts from original data
>>>> pred <- predict(model, type = "terms", newdata = new)  # error
>>>> pred <- predict(model, type = "terms", newdata = new, se.fit = TRUE)  # error
>>>> pred <- predict(model, type = "terms", newdata = new,
>>>>                 interval = "prediction")  # error
>>>> new2 <- data.frame(area = c(8172, 10220, 11570, 24150), concn = 0)
>>>> new2
>>>> pred <- predict(model, type = "terms", newdata = new2)  # wrong results
>>>>
>>>> Can someone please show me what I'm doing wrong?
>>>>
-- 

David Winsemius, MD
Alameda, CA, USA