[R] Quality of fit statistics for NLS?

Fri Jan 27 10:58:04 CET 2012

On Jan 26, 2012, at 22:51 , Bert Gunter wrote:

> Inline below.
> 
> -- Bert
> 
> On Thu, Jan 26, 2012 at 12:16 PM, Max Brondfield
> <max.brondfield at gmail.com> wrote:
>> Dear all,
>> I am trying to analyze some non-linear data to which I have fit a curve of
>> the following form:
>> 
>> dum <- nls(y~(A + (B*x)/(C+x)), start = list(A=370,B=100,C=23000))
>> 
>> I am wondering if there is any way to determine meaningful quality of fit
>> statistics from the nls function?
>> 
>> A summary yields highly significant p-values, but it is my impression that
>> these are questionable at best given the iterative nature of the fit:
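> No. They are questionable primarily because there is no clear null
> model. The summary() p-values are Wald-type t tests based on a linear
> approximation at the estimates; confint() instead uses profile
> likelihoods (as ?confint tells you), which may or may not be what you
> want for "goodness of fit."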
> 
> One can always get "goodness of fit" statistics but the question in
> nonlinear models is: goodness of fit with respect to what? So the
> answer to your question is: if you know what you're doing, certainly.
> Otherwise, find someone who does.

...and if you are in the process of learning what you are doing: p-values are almost _never_ a good measure of goodness-of-fit, whereas the residual standard error might be, especially if you take a prediction approach to things. For one-dimensional curve fits, a graph of the data with the fitted curve is often what is really needed.
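
A minimal sketch of those two checks in R, using simulated data as a stand-in because the original data were not posted. The x range, the "true" parameter values and the noise level are assumptions chosen only so that the example runs; only the model formula and the starting values come from the original post.

```r
# Simulated stand-in data (assumption: the real x and y were never posted).
set.seed(1)
x <- seq(0, 1e5, length.out = 248)
y <- 390 + 115 * x / (20000 + x) + rnorm(248, sd = 10)

# Same model form and starting values as in the original question.
dum <- nls(y ~ A + (B * x) / (C + x),
           start = list(A = 370, B = 100, C = 23000))

# Residual standard error: the typical size of a residual, in units of y.
rse <- summary(dum)$sigma
print(rse)

# Data with the fitted curve overlaid.
xs <- seq(min(x), max(x), length.out = 200)
plot(x, y, pch = 16, col = "grey50")
lines(xs, predict(dum, newdata = data.frame(x = xs)), lwd = 2)
```

Whether a given residual standard error is "good" depends on how precise the predictions need to be, so it is best read against the scale of y and the intended use.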

Also notice that summaries of fitted models are not useful for detecting systematic deviations from the model (like systematic over/under-estimation in some regions); for that you need diagnostic plots and/or comparisons with extended models.
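
A rough sketch of both ideas in R, again on simulated stand-in data (the x range, parameter values and noise level are assumptions). The extra linear term D*x is a hypothetical extension picked purely for illustration; which extension makes sense depends on the subject matter, and the extended fit may fail to converge, hence the try().

```r
# Simulated stand-in data (assumption: the real data were not posted).
set.seed(1)
x <- seq(0, 1e5, length.out = 248)
y <- 390 + 115 * x / (20000 + x) + rnorm(248, sd = 10)

dum <- nls(y ~ A + (B * x) / (C + x),
           start = list(A = 370, B = 100, C = 23000))

res <- as.numeric(residuals(dum))

# Diagnostic plots: residuals against x with a smoother, and a normal Q-Q plot.
# A smoother that wanders away from zero suggests regions of over/under-fit.
op <- par(mfrow = c(1, 2))
plot(x, res, pch = 16, col = "grey50", ylab = "residual")
abline(h = 0, lty = 2)
lines(lowess(x, res), lwd = 2)
qqnorm(res)
qqline(res)
par(op)

# Extended model (hypothetical extra term D * x), started at the simpler fit.
st <- c(as.list(coef(dum)), D = 0)
ext <- try(nls(y ~ A + (B * x) / (C + x) + D * x, start = st),
           silent = TRUE)

# If the extended fit converged, compare the nested fits with an F test.
if (!inherits(ext, "try-error")) print(anova(dum, ext))
```

Here the data were generated from the simpler model, so neither the plots nor the comparison should show much; with real data, a clear trend in the residuals or a strongly preferred extended model would point to a systematic misfit.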

> 
> 
> 
>> 
>>> summary(dum)
>> 
>> Formula: y ~ (A + (B * x)/(C + x))
>> 
>> Parameters:
>>   Estimate Std. Error t value Pr(>|t|)
>> A   388.753      4.794  81.090  < 2e-16 ***
>> B   115.215      5.006  23.015  < 2e-16 ***
>> C 20843.832   4646.937   4.485 1.12e-05 ***
>> ---
>> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>> 
>> Residual standard error: 18.25 on 245 degrees of freedom
>> 
>> Number of iterations to convergence: 4
>> Achieved convergence tolerance: 2.244e-06
>> 
>> 
>> Is there any other means of determining the quality of the curve fit? I
>> have tried applying confidence intervals using confint(dum), but these
>> curves seem unrealistically narrow. Thanks so much for your help!
>> -Max
>> 
>> 
>> 
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>> 
> 
> 
> 
> -- 
> 
> Bert Gunter
> Genentech Nonclinical Biostatistics
> 
> Internal Contact Info:
> Phone: 467-7374
> Website:
> http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
> 

-- 
Peter Dalgaard, Professor
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com