[R] Problem with 'nls' fitting logistic model (5PL)

Bert Gunter gunter.berton at gene.com
Thu May 3 19:07:59 CEST 2012


Thanks, Keith.

I failed to cc the following reply to John Nash to the list. Your
email persuaded me that it might be useful to do so.

None of this changes the fact that the model is overfitted. You may be
able to get convergence to some set of parameter estates, but they
won't have much meaning since the confidence region will be huge
(better word: "essentially degenerate" -- since it would have
essentially 0 volume in 5d). Predictions should be ok, but if that is
all that is wanted, a more parsimonious model would be better.
Extrapolation is nuts of course.

In my experience(only), scientists tend to overfit nonlinear models.
Failure to converge seems to me to be more appropriate in these
circumstances than providing meaningless parameters.

And,yes I know I'm a curmudgeon. But facilitating bad practice should
always be a concern.

Best,
Bert

On Thu, May 3, 2012 at 9:38 AM, Keith Jewell <k.jewell at campden.co.uk> wrote:
>> ?nls.control
>>  fit<- nls(MFI~a + b/((1+(nom/c)^d)^f), data=x, weights=x$weights,
> +  start=c(a=100, b=10000, c=100, d=-1, f=1),
> control=nls.control(warnOnly=TRUE))
> Warning message:
> In nls(MFI ~ a + b/((1 + (nom/c)^d)^f), data = x, weights = x$weights,  :
>  step factor 0.000488281 reduced below 'minFactor' of 0.000976562
>> fit
> Nonlinear regression model
>  model:  MFI ~ a + b/((1 + (nom/c)^d)^f)
>   data:  x
>         a          b          c          d          f
>  165.4360 23866.1247    38.6157    -0.4454     3.2210
>  weighted residual sum-of-squares: 2627977
>
> Number of iterations till stop: 23
> Achieved convergence tolerance: 48.23
> Reason stopped: step factor 0.000488281 reduced below 'minFactor' of
> 0.000976563
>> summary(fit)
>
> Formula: MFI ~ a + b/((1 + (nom/c)^d)^f)
>
> Parameters:
>    Estimate Std. Error t value Pr(>|t|)
> a  1.654e+02  2.742e+04   0.006    0.995
> b  2.387e+04  3.027e+06   0.008    0.994
> c  3.862e+01  3.030e+04   0.001    0.999
> d -4.454e-01  5.754e+01  -0.008    0.994
> f  3.221e+00  1.008e+03   0.003    0.998
>
> Residual standard error: 540.4 on 9 degrees of freedom
>
> Number of iterations till stop: 23
> Achieved convergence tolerance: 48.23
> Reason stopped: step factor 0.000488281 reduced below 'minFactor' of
> 0.000976563
>>
>
> Perhaps nls is a little more stringent than "ANY  reliable statistical
> software" in what, by default, it considers a fit worth reporting?
>
> Keith J
>
> "Michal Figurski" <figurski at mail.med.upenn.edu> wrote in message
> news:4FA2759C.3060806 at mail.med.upenn.edu...
>> Bert,
>>
>> Thank you for your thoughts.
>>
>> I can assure you I have plotted the data back and forth many times, and
>> that overfitting has nothing to do with it. This is not a _statistical_
>> problem, but a _technical_ problem. Something that works well in ANY
>> reliable statistical software doesn't work in R with 'nls'.
>>
>> This just makes me think that 'nls' is a dysfunctional piece of junk that
>> can't handle even a very simple problem. Not to mention that 'predict()'
>> for 'nls' doesn't give you any sort of confidence interval.
>>
>> I wonder if there's another package in R that could be used to fit
>> 5P-logistic model. Any idea?
>>
>> In the end I may attempt to write a fitting function myself, but that
>> would be the last resort.
>>
>> --
>> Michal J. Figurski, PhD
>> HUP, Pathology & Laboratory Medicine
>> Biomarker Research Laboratory
>> 3400 Spruce St. 7 Maloney S
>> Philadelphia, PA 19104
>> tel. (215) 662-3413
>>
>>
>> On 5/2/2012 3:47 PM, Bert Gunter wrote:
>>> Plot the data. You're clearly overfitting.
>>>
>>> (If you don't know what this means or why it causes the problems you
>>> see, try a statistical help list or consult your local statistician).
>>>
>>> -- Bert
>>>
>>> On Wed, May 2, 2012 at 12:32 PM, Michal Figurski
>>> <figurski at mail.med.upenn.edu>  wrote:
>>>> Dear R-Helpers,
>>>>
>>>> I'm working with immunoassay data and 5PL logistic model. I wanted to
>>>> experiment with different forms of weighting and parameter selection,
>>>> which
>>>> is not possible in instrument software, so I turned to R.
>>>>
>>>> I am using R 2.14.2 under Win7 64bit, and the 'nls' library to fit the
>>>> model
>>>> - I started with the same model and weighting type (1/y) as in the
>>>> instrument to see if I'll get similar results. However, in some
>>>> instances I
>>>> don't get any results - just errors.
>>>>
>>>> Here is an example calibration data, representative of my experiment.
>>>> Instrument soft had no problem fitting it:
>>>> x<- structure(list(SPL = structure(c(1L, 1L, 2L, 2L, 3L, 3L, 4L,
>>>> 4L, 5L, 5L, 6L, 6L, 7L, 7L), .Label = c("St1", "St2", "St3",
>>>> "St4", "St5", "St6", "St7"), class = "factor"), MFI = c(10755.5,
>>>> 9839, 5142.5, 4857, 1510.5, 1505, 502.5, 451, 215, 195.5, 58,
>>>> 57, 15, 15), nom = c(206, 206, 125, 125, 68, 68, 38, 38, 24,
>>>> 24, 13, 13, 6.5, 6.5), weights = c(0.0013946353028683,
>>>> 0.00152454517735542,
>>>> 0.00291686922702965, 0.00308832612723904, 0.0099304865938431,
>>>> 0.00996677740863787, 0.0298507462686567, 0.0332594235033259,
>>>> 0.0697674418604651, 0.0767263427109974, 0.258620689655172,
>>>> 0.263157894736842,
>>>> 1, 1)), .Names = c("SPL", "MFI", "nom", "weights"), row.names = c(NA,
>>>> -14L), class = "data.frame")
>>>>
>>>> And here is the nls fit:
>>>> fit<- nls(MFI~a + b/((1+(nom/c)^d)^f), data=x, weights=x$weights,
>>>> start=c(a=100, b=10000, c=100, d=-1, f=1))
>>>>
>>>> I've tried every possible combination of starting values, including the
>>>> values fitted by the instrument soft - to no avail. I've probably seen
>>>> all
>>>> possible error messages from 'nls' trying to fit this.
>>>>
>>>> If anyone has an idea why it's not working - let me know.
>>>>
>>>> Best regards,
>>>>
>>>> --
>>>> Michal J. Figurski, PhD
>>>> HUP, Pathology&  Laboratory Medicine
>>>> Biomarker Research Laboratory
>>>> 3400 Spruce St. 7 Maloney S
>>>> Philadelphia, PA 19104
>>>> tel. (215) 662-3413
>>>>
>>>> ______________________________________________
>>>> R-help at r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>>
>>>
>>
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm



More information about the R-help mailing list