[R] Prediction intervals (i.e. not CI of the fit) for monotonic loess curve using bootstrapping

Jan Stanstrup jan.stanstrup at fmach.it
Mon Aug 18 13:56:12 CEST 2014


The knots are deleted anyway ("Deleting unnecessary knots ..."). It 
seems to make no difference.



On 08/14/2014 06:06 PM, David Winsemius wrote:
>
> On Aug 14, 2014, at 7:17 AM, Jan Stanstrup wrote:
>
>> Thank you very much for this snippet!
>>
>> I used it on my data and indeed it does give intervals which 
>> appear quite realistic (script and data here 
>> https://github.com/stanstrup/retpred_shiny/blob/master/retdb_admin/make_predictions_CI_tests.R).
>> I also tried getting the intervals with predict.cobs but the 
>> methods available there gave very narrow bands.
>> The only problem I can see is that the fit tend to be a bit on 
>> the smooth side. See for example the upper interval limits at x = 2 
>> to 3 and x =1.2. If then I set lambda to something low like 0.05 
>> the band narrows to nearly nothing when there are few points. For 
>> example at x = 2.5. Is there some other parameter I would be adjusting?
>>
>
> Try specifying the number and location of the knots (using my example 
> data):
>
> > Rbs.9 <- cobs(age,analyte,constraint="increase",tau=0.9, nknots=6, 
> knots=seq(60,85,by=5))
> > plot(age,analyte, ylim=c(0,2000))
> >  lines(predict(Rbs.9), col = 2, lwd = 1.5)
>
>
> -- 
> David.
>
>>
>>
>> ----------------------
>> Jan Stanstrup
>> Postdoc
>>
>> Metabolomics
>> Food Quality and Nutrition
>> Fondazione Edmund Mach
>>
>>
>>
>> On 08/14/2014 02:02 AM, David Winsemius wrote:
>>>
>>> On Aug 12, 2014, at 8:40 AM, Bert Gunter wrote:
>>>
>>>> PI's of what? -- future individual values or mean values?
>>>>
>>>> I assume quantreg provides quantiles for the latter, not the former.
>>>> (See ?predict.lm for a terse explanation of the difference).
>>>
>>> I probably should have questioned the poster about what was meant by 
>>> a "prediction interval for a monotonic loess curve". I was 
>>> suggesting quantile regression for estimation of a chosen quantile, 
>>> say the 90th percentile. I was thinking it could produce the 
>>> analogue of a 90th percentile value (with no         reference to a 
>>> mean value or use of presumed distribution within adjacent windows 
>>> of say 100-150 points. I had experience using the cobs function (in 
>>> the package of the same name) as Koenker illustrates:
>>>
>>> age <- runif(1000,min=60,max=85)
>>>
>>>  analyte <- rlnorm(1000,4*(age/60),age/60)
>>>  plot(age,analyte)
>>>
>>>  library(cobs)
>>>  library(quantreg)
>>>  Rbs.9 <- cobs(age,analyte, constraint="increase",tau=0.9)
>>> Rbs.median <- cobs(age,analyte,constraint="increase",tau=0.5)
>>>
>>> png("cobs.png"); plot(age,analyte, ylim=c(0,2000))
>>>  lines(predict(Rbs.9), col = "red", lwd = 1.5)
>>> lines(predict(Rbs.median), col = "blue", lwd = 1.5)
>>>  dev.off()
>>> <Mail Attachment.png>
>>>
>>> -- David
>>>
>>>
>>>> obtainable from bootstrapping but the details depend on what you are
>>>> prepared to assume. Consult references or your local statistician for
>>>> help if needed.
>>>>
>>>> -- Bert
>>>>
>>>> Bert Gunter
>>>> Genentech Nonclinical Biostatistics
>>>> (650) 467-7374
>>>>
>>>> "Data is not information. Information is not knowledge. And knowledge
>>>> is certainly not wisdom."
>>>> Clifford Stoll
>>>>
>>>>
>>>>
>>>>
>>>> On Tue, Aug 12, 2014 at 8:20 AM, David Winsemius 
>>>> <dwinsemius at comcast.net <mailto:dwinsemius at comcast.net>> wrote:
>>>>>
>>>>> On Aug 12, 2014, at 12:23 AM, Jan Stanstrup wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I am trying to find a way to estimate prediction intervals (PI) 
>>>>>> for a monotonic loess curve using bootstrapping.
>>>>>>
>>>>>> At the moment my approach is to use the boot function from the 
>>>>>> boot package to bootstrap my loess model, which consist of loess 
>>>>>> + monoproc from the monoproc package (to force the fit to 
>>>>>> be monotonic which gives me much improved results with my 
>>>>>> particular data). The output from the monoproc package is 
>>>>>> simply the fitted y values at each x-value.
>>>>>> I then use boot.ci (again from the boot package) to get 
>>>>>> confidence intervals. The problem is that this gives me 
>>>>>> confidence intervals (CI) for the "fit" (is there a proper way to 
>>>>>> specify this?) and not a prediction interval. The interval is 
>>>>>> thus way too optimistic to give me an idea of the confidence 
>>>>>> interval of a predicted value.
>>>>>>
>>>>>> For linear models predict.lm can give PI instead of CI by setting 
>>>>>> interval = "prediction". Further discussion of that here:
>>>>>> http://stats.stackexchange.com/questions/82603/understanding-the-confidence-band-from-a-polynomial-regression
>>>>>> http://stats.stackexchange.com/questions/44860/how-to-prediction-intervals-for-linear-regression-via-bootstrapping.
>>>>>>
>>>>>> However I don't see a way to do that for boot.ci. Does there 
>>>>>> exist a way to get PIs after bootstrapping? If some sample code 
>>>>>> is required I am more than happy to supply it but I 
>>>>>> thought the question was general enough to be understandable 
>>>>>> without it.
>>>>>>
>>>>>
>>>>> Why not use the quantreg package to estimate the quantiles of 
>>>>> interest to you? That way you would not be depending on Normal 
>>>>> theory assumptions which you apparently don't trust. I've used it 
>>>>> with the `cobs` function from the package of the same name to 
>>>>> implement the monotonic constraint. I think there is a 
>>>>> worked example in the quantreg package, but since I 
>>>>> bought Koenker's book, I may be remembering from there.
>>>>> --
>>>>>
>>>>> David Winsemius
>>>>> Alameda, CA, USA
>>>>>
>>>>> ______________________________________________
>>>>> R-help at r-project.org mailing list
>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>> PLEASE do read the posting 
>>>>> guide http://www.R-project.org/posting-guide.html
>>>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>> David Winsemius
>>> Alameda, CA, USA
>>>
>>
>> <boot2ci_PI.png><cobs.png>
>
> David Winsemius
> Alameda, CA, USA
>



More information about the R-help mailing list