[R] predict "interval" for lmRob?

Greg Snow Greg.Snow at imail.org
Wed Apr 8 19:20:11 CEST 2009


Your problem is related to the theory underlying linear models (and is an example of why it is important to understand the theory, not just know how to plug numbers into a computer).

The lm function is based on theory that assumes the y variable is normally distributed (conditional on the x values), with the mean of that normal distribution determined by the model and the x values.  This allows the predict function for lm to create prediction intervals based on the normal distribution, the predicted mean of that distribution, the estimated standard deviation, and the uncertainty in the predicted mean.  Note that if your y variable is not normally distributed, but the sample size is large enough for the Central Limit Theorem to apply, then the confidence intervals will be approximately correct, but the prediction intervals will probably not be.
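
As a minimal sketch of the two interval types that predict offers for lm (the data are simulated and the object names are made up for illustration), the following compares their widths at a few new x values.  The prediction interval is wider because it adds the spread of an individual y to the uncertainty in the estimated mean:

```r
# Simulated data with normal errors, purely for illustration
set.seed(1)
x <- runif(50, 0, 10)
y <- 2 + 3 * x + rnorm(50, sd = 2)

fit <- lm(y ~ x)
new <- data.frame(x = c(2, 5, 8))

# Interval for the mean of y at each new x
conf_int <- predict(fit, newdata = new, interval = "confidence", level = 0.95)

# Interval for a single future y at each new x (relies on the normality assumption)
pred_int <- predict(fit, newdata = new, interval = "prediction", level = 0.95)

# Compare the widths of the two kinds of interval
widths <- cbind(conf_width = conf_int[, "upr"] - conf_int[, "lwr"],
                pred_width = pred_int[, "upr"] - pred_int[, "lwr"])
print(widths)
```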

When you switch to a robust regression approach, the assumption is that the y variable is not normal, so a prediction interval based on the normal distribution does not make sense.  To get an appropriate prediction interval you need some information on what the distribution of the y values is (conditional on the model), but most robust techniques are not based on a specific distribution, just some properties of the distribution.  Without some information (or at least an assumption) on that distribution, the predict method cannot create prediction intervals.

I know that this does not answer your question, but I hope it helps you to understand what is happening.  Think about what your actual scientific question is; it may be that you can answer it without prediction intervals.

If you feel that you really need the prediction intervals, then you will need to do some additional background research into what distribution you think the data come from; you can then proceed from there.  Some options include fitting a model based on that distribution, simulating data from the distribution given the model estimates and the uncertainty in those estimates, quantile regression, mixture of regressions, and others.
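
To make the simulation option concrete, here is one rough sketch, not a recommendation for your data.  It swaps in a residual bootstrap, which uses the observed residuals as a stand-in for the unknown error distribution, and it assumes the errors are independent with the same distribution at every x.  It also uses rlm from the MASS package rather than lmRob, simply because MASS ships with R; the simulated data, the number of resamples B, and all object names are made up for illustration:

```r
library(MASS)  # provides rlm(); used here in place of lmRob as an assumption

# Simulated data with heavy-tailed errors, purely for illustration
set.seed(42)
n <- 100
dat <- data.frame(x = runif(n, 0, 10))
dat$y <- 1 + 2 * dat$x + rt(n, df = 3)

fit <- rlm(y ~ x, data = dat, maxit = 50)
res <- residuals(fit)
fitted_y <- fitted(fit)
new <- data.frame(x = c(2, 5, 8))

B <- 500  # number of bootstrap resamples (arbitrary choice)
sims <- matrix(NA_real_, nrow = B, ncol = nrow(new))
for (b in seq_len(B)) {
  # Rebuild a data set from the fitted values plus resampled residuals,
  # then refit: this carries the uncertainty in the estimates
  boot <- data.frame(x = dat$x,
                     y = fitted_y + sample(res, n, replace = TRUE))
  boot_fit <- rlm(y ~ x, data = boot, maxit = 50)
  # Add a resampled residual to mimic the scatter of a new observation
  sims[b, ] <- predict(boot_fit, newdata = new) +
    sample(res, nrow(new), replace = TRUE)
}

# Empirical 2.5% and 97.5% quantiles give an approximate 95% prediction band
pred_band <- t(apply(sims, 2, quantile, probs = c(0.025, 0.975)))
colnames(pred_band) <- c("lwr", "upr")
print(cbind(new, fit = predict(fit, newdata = new), pred_band))
```

If the spread of the residuals changes with x, or the residuals are otherwise not exchangeable, a band built this way will be misleading, which is why the background research into the distribution comes first.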

Hope this helps,

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at imail.org
801.408.8111


> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-
> project.org] On Behalf Of Galkowski, Jan
> Sent: Wednesday, April 08, 2009 9:32 AM
> To: r-help at r-project.org
> Subject: [R] predict "interval" for lmRob?
> 
> lm's "predict" function offers an "interval" parameter to choose
> between 'confidence' and 'prediction' bands. In the package "robust"
> and for "lmRob", there is also a "predict" but it lacks such a
> parameter, and the documented "type" parameter has only "response"
> offered.  Is there some way of obtaining prediction bands from lmRob?
> Is there an alternative robust (linear) regression package that offers
> such a capability?
> 
> Thanks for any and all help.
> 
>   - Jan Galkowski, Akamai Technologies, Cambridge, MA.
> 
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.



