[R] Prediction intervals (i.e. not CI of the fit) for monotonic loess curve using bootstrapping
David Winsemius
dwinsemius at comcast.net
Thu Aug 14 19:22:46 CEST 2014
On Aug 14, 2014, at 9:06 AM, David Winsemius wrote:
>
> On Aug 14, 2014, at 7:17 AM, Jan Stanstrup wrote:
>
>> Thank you very much for this snippet!
>>
>> I used it on my data and indeed it does give intervals which appear quite realistic (script and data here https://github.com/stanstrup/retpred_shiny/blob/master/retdb_admin/make_predictions_CI_tests.R).
>> I also tried getting the intervals with predict.cobs but the methods available there gave very narrow bands.
>> The only problem I can see is that the fit tend to be a bit on the smooth side. See for example the upper interval limits at x = 2 to 3 and x =1.2. If then I set lambda to something low like 0.05 the band narrows to nearly nothing when there are few points. For example at x = 2.5. Is there some other parameter I would be adjusting?
>>
>
> Try specifying the number and location of the knots (using my example data):
>
>> Rbs.9 <- cobs(age,analyte,constraint="increase",tau=0.9, nknots=6, knots=seq(60,85,by=5))
>> plot(age,analyte, ylim=c(0,2000))
>> lines(predict(Rbs.9), col = 2, lwd = 1.5)
My png file seems to have gotten stripped (try again):
(Not responding directly to Jan Stanstrup because his email address seems to force holdup in the moderation queue.)
--
David.
>
>
>
> --
> David.
>
>>
>>
>> ----------------------
>> Jan Stanstrup
>> Postdoc
>>
>> Metabolomics
>> Food Quality and Nutrition
>> Fondazione Edmund Mach
>>
>>
>>
>> On 08/14/2014 02:02 AM, David Winsemius wrote:
>>>
>>> On Aug 12, 2014, at 8:40 AM, Bert Gunter wrote:
>>>
>>>> PI's of what? -- future individual values or mean values?
>>>>
>>>> I assume quantreg provides quantiles for the latter, not the former.
>>>> (See ?predict.lm for a terse explanation of the difference).
>>>
>>> I probably should have questioned the poster about what was meant by a "prediction interval for a monotonic loess curve". I was suggesting quantile regression for estimation of a chosen quantile, say the 90th percentile. I was thinking it could produce the analogue of a 90th percentile value (with no reference to a mean value or use of presumed distribution within adjacent windows of say 100-150 points. I had experience using the cobs function (in the package of the same name) as Koenker illustrates:
>>>
>>> age <- runif(1000,min=60,max=85)
>>>
>>> analyte <- rlnorm(1000,4*(age/60),age/60)
>>> plot(age,analyte)
>>>
>>> library(cobs)
>>> library(quantreg)
>>> Rbs.9 <- cobs(age,analyte, constraint="increase",tau=0.9)
>>> Rbs.median <- cobs(age,analyte,constraint="increase",tau=0.5)
>>>
>>> png("cobs.png"); plot(age,analyte, ylim=c(0,2000))
>>> lines(predict(Rbs.9), col = "red", lwd = 1.5)
>>> lines(predict(Rbs.median), col = "blue", lwd = 1.5)
>>> dev.off()
>>> <Mail Attachment.png>
>>>
>>> -- David
>>>
>>>
>>>> obtainable from bootstrapping but the details depend on what you are
>>>> prepared to assume. Consult references or your local statistician for
>>>> help if needed.
>>>>
>>>> -- Bert
>>>>
>>>> Bert Gunter
>>>> Genentech Nonclinical Biostatistics
>>>> (650) 467-7374
>>>>
>>>> "Data is not information. Information is not knowledge. And knowledge
>>>> is certainly not wisdom."
>>>> Clifford Stoll
>>>>
>>>>
>>>>
>>>>
>>>> On Tue, Aug 12, 2014 at 8:20 AM, David Winsemius <dwinsemius at comcast.net> wrote:
>>>>>
>>>>> On Aug 12, 2014, at 12:23 AM, Jan Stanstrup wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I am trying to find a way to estimate prediction intervals (PI) for a monotonic loess curve using bootstrapping.
>>>>>>
>>>>>> At the moment my approach is to use the boot function from the boot package to bootstrap my loess model, which consist of loess + monoproc from the monoproc package (to force the fit to be monotonic which gives me much improved results with my particular data). The output from the monoproc package is simply the fitted y values at each x-value.
>>>>>> I then use boot.ci (again from the boot package) to get confidence intervals. The problem is that this gives me confidence intervals (CI) for the "fit" (is there a proper way to specify this?) and not a prediction interval. The interval is thus way too optimistic to give me an idea of the confidence interval of a predicted value.
>>>>>>
>>>>>> For linear models predict.lm can give PI instead of CI by setting interval = "prediction". Further discussion of that here:
>>>>>> http://stats.stackexchange.com/questions/82603/understanding-the-confidence-band-from-a-polynomial-regression
>>>>>> http://stats.stackexchange.com/questions/44860/how-to-prediction-intervals-for-linear-regression-via-bootstrapping.
>>>>>>
>>>>>> However I don't see a way to do that for boot.ci. Does there exist a way to get PIs after bootstrapping? If some sample code is required I am more than happy to supply it but I thought the question was general enough to be understandable without it.
>>>>>>
>>>>>
>>>>> Why not use the quantreg package to estimate the quantiles of interest to you? That way you would not be depending on Normal theory assumptions which you apparently don't trust. I've used it with the `cobs` function from the package of the same name to implement the monotonic constraint. I think there is a worked example in the quantreg package, but since I bought Koenker's book, I may be remembering from there.
>>>>> --
>>>>>
>>>>> David Winsemius
>>>>> Alameda, CA, USA
>>>>>
>>>>> ______________________________________________
>>>>> R-help at r-project.org mailing list
>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>> David Winsemius
>>> Alameda, CA, USA
>>>
>>
>> <boot2ci_PI.png><cobs.png>
>
> David Winsemius
> Alameda, CA, USA
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
David Winsemius
Alameda, CA, USA
More information about the R-help
mailing list