[R] Prediction intervals (i.e. not CI of the fit) for monotonic loess curve using bootstrapping

Wed Aug 13 16:59:41 CEST 2014

Thanks to all of you for your suggestions and comments. I really 
appreciate it.

Some comments to Dennis' comments:
1) I am not concerned about predicting outside the original range. That 
would be nonsense anyway considering the physical phenomenon I am 
modeling. I am, however, concerned that the bootstrapping leads to 
extremely wide CIs at the extremes of the range when there are few data 
points. But I guess there is not much I can do about that as long as I 
rely on bootstrapping?

2) I have made a function that does the interpolation to the requested 
new x's from the original modeling data to get the residual variance and 
the model variance. Then it interpolates the combined SDs back to the 
new x values. See below.
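
A minimal sketch of the idea in point 2, not the function from the linked 
repository. It assumes the two SDs combine as sqrt(sd_fit^2 + sd_res^2), 
uses one pooled residual SD rather than a locally interpolated one, takes a 
normal-theory 1.96 multiplier, and swaps in cummax() as a crude monotone 
step in place of monoproc. The simulated data and the names fit_mono, 
sd_fit, sd_res and sd_pred are made up for illustration.

```r
# Hypothetical sketch: approximate 95% prediction interval for a monotone loess fit.
# Base R only. cummax() is a stand-in for monoproc, NOT the monoproc algorithm.
set.seed(1)
n <- 200
x <- sort(runif(n, 0, 10))
y <- 2 * sqrt(x) + rnorm(n, sd = 0.3)   # simulated, monotone trend plus noise
dat <- data.frame(x = x, y = y)

# Fit loess and force the predictions at (sorted) newx to be non-decreasing.
fit_mono <- function(d, newx) {
  fit <- loess(y ~ x, data = d, span = 0.75,
               control = loess.control(surface = "direct"))  # allows extrapolation
  p <- predict(fit, newdata = data.frame(x = newx))
  cummax(p)
}

newx <- seq(1, 9, length.out = 25)   # must be sorted for cummax() to make sense
pred <- fit_mono(dat, newx)

# Model variance: SD of the bootstrapped fits at each new x.
B <- 200
boot_fits <- replicate(B, {
  idx <- sample(n, replace = TRUE)
  fit_mono(dat[idx, ], newx)
})
sd_fit <- apply(boot_fits, 1, sd)

# Residual variance: one pooled SD of the residuals at the original x values.
res <- dat$y - fit_mono(dat, dat$x)
sd_res <- sd(res)

# Combine both sources of error (assumed root-sum-of-squares) into a PI.
sd_pred  <- sqrt(sd_fit^2 + sd_res^2)
pi_lower <- pred - 1.96 * sd_pred
pi_upper <- pred + 1.96 * sd_pred
```

Plotting dat with lines for pred, pi_lower and pi_upper against newx shows 
a band that is never narrower than the residual scatter, which is the 
difference from the CI of the fit alone.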

3) I understand that. For this project it is not that important that the 
final prediction intervals are super accurate. But I need to hit the 
ballpark. I am only trying to do something that doesn't grossly 
underestimate the prediction error and doesn't make statisticians lose 
their lunch at first glance.
I also cannot avoid erroneous values in my data, and I will need to 
build many models unsupervised. The fit should be good enough, though, 
that I can eliminate values outside some multiple of the prediction 
interval and then re-calculate. And if a model is not good over some 
part of the range I will throw it out completely.
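
The trimming step just described could look roughly like this. It is only a 
sketch: a single pass, with k * mad(res) standing in for "some multiple of 
the prediction interval", cummax() again standing in for monoproc, and 
simulated data with three injected bad values. The names fit_mono, k, keep 
and dat_clean are hypothetical.

```r
# Hypothetical sketch: drop points far outside a residual-based band, then refit.
set.seed(2)
n <- 150
x <- sort(runif(n, 0, 10))
y <- 2 * sqrt(x) + rnorm(n, sd = 0.3)
bad <- c(20, 75, 130)
y[bad] <- y[bad] + 5                 # inject erroneous values
dat <- data.frame(x = x, y = y)

# Loess followed by a crude monotone step (stand-in for monoproc).
fit_mono <- function(d, newx) {
  fit <- loess(y ~ x, data = d, span = 0.75,
               control = loess.control(surface = "direct"))
  cummax(predict(fit, newdata = data.frame(x = newx)))
}

# First pass: residuals from the fit to all data.
res <- dat$y - fit_mono(dat, dat$x)

# Keep points within k robust SDs of the fit; mad() resists the bad values.
k <- 3
keep <- abs(res) <= k * mad(res)
dat_clean <- dat[keep, ]

# Second pass: refit on the trimmed data and recompute the residuals.
res_clean <- dat_clean$y - fit_mono(dat_clean, dat_clean$x)
```

In an unsupervised setting the fraction of points removed, 
mean(!keep), is a cheap flag for models that should be thrown out rather 
than trimmed.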

Based on the formula of my last message I have made a function that at 
least gives less optimistic intervals than what I could get with the 
other methods I have tried. The function and example data can be found 
here 
https://github.com/stanstrup/retpred_shiny/blob/master/retdb_admin/make_predictions_CI_tests.R 
in case anyone has any comments, suggestions or expletives about my 
implementation.

----------------------
Jan Stanstrup
Postdoc

Metabolomics
Food Quality and Nutrition
Fondazione Edmund Mach

On 08/12/2014 05:40 PM, Bert Gunter wrote:
> PI's of what? -- future individual values or mean values?
>
> I assume quantreg provides quantiles for the latter, not the former.
> (See ?predict.lm for a terse explanation of the difference). Both are
> obtainable from bootstrapping but the details depend on what you are
> prepared to assume. Consult references or your local statistician for
> help if needed.
>
> -- Bert
>
> Bert Gunter
> Genentech Nonclinical Biostatistics
> (650) 467-7374
>
> "Data is not information. Information is not knowledge. And knowledge
> is certainly not wisdom."
> Clifford Stoll
>
>
>
>
> On Tue, Aug 12, 2014 at 8:20 AM, David Winsemius <dwinsemius at comcast.net> wrote:
>> On Aug 12, 2014, at 12:23 AM, Jan Stanstrup wrote:
>>
>>> Hi,
>>>
>>> I am trying to find a way to estimate prediction intervals (PI) for a monotonic loess curve using bootstrapping.
>>>
>>> At the moment my approach is to use the boot function from the boot package to bootstrap my loess model, which consists of loess + monoproc from the monoproc package (to force the fit to be monotonic which gives me much improved results with my particular data). The output from the monoproc package is simply the fitted y values at each x-value.
>>> I then use boot.ci (again from the boot package) to get confidence intervals. The problem is that this gives me confidence intervals (CI) for the "fit" (is there a proper way to specify this?) and not a prediction interval. The interval is thus way too optimistic to give me an idea of the confidence interval of a predicted value.
>>>
>>> For linear models predict.lm can give PI instead of CI by setting interval = "prediction". Further discussion of that here:
>>> http://stats.stackexchange.com/questions/82603/understanding-the-confidence-band-from-a-polynomial-regression
>>> http://stats.stackexchange.com/questions/44860/how-to-prediction-intervals-for-linear-regression-via-bootstrapping.
>>>
>>> However I don't see a way to do that for boot.ci. Does there exist a way to get PIs after bootstrapping? If some sample code is required I am more than happy to supply it but I thought the question was general enough to be understandable without it.
>>>
>> Why not use the quantreg package to estimate the quantiles of interest to you? That way you would not be depending on Normal theory assumptions which you apparently don't trust. I've used it with the `cobs` function from the package of the same name to implement the monotonic constraint. I think there is a worked example in the quantreg package, but since I bought Koenker's book, I may be remembering from there.
>> --
>>
>> David Winsemius
>> Alameda, CA, USA
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.