[R] nls problems (formerly no subject)

Douglas Bates bates at stat.wisc.edu
Thu Aug 28 19:36:50 CEST 2003


I agree with what you said about using trace = TRUE when you are
having trouble getting nls to converge.  It allows you to see what
is happening to the parameters during the iterations, which is often
quite instructive; so is plotting your data and thinking about
whether you should expect to be able to estimate all the parameters in
your model from the data that you have.  The nls function is not
magic.  It can only use the information that is available in the
data.
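As a minimal sketch of what trace = TRUE shows (the exponential-decay
model and simulated data here are hypothetical, not from the thread):

```r
# Simulated data from a two-parameter exponential-decay model.
set.seed(1)
x <- 1:20
y <- 5 * exp(-0.3 * x) + rnorm(20, sd = 0.05)

# trace = TRUE prints the residual sum of squares and the current
# parameter values at each iteration, so you can watch the path
# the algorithm takes.
fit <- nls(y ~ a * exp(-b * x),
           start = list(a = 4, b = 0.2),
           trace = TRUE)
coef(fit)
```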

Spencer Graves <spencer.graves at pdf.com> writes:

> Also, have you considered using "optim" first, then feeding
> the answers to "nls"?  McCullough found a few years ago that it was
> easier for him to get answers if he did it that way, because the
> S-Plus version of "nls" seemed to get lost and quit prematurely, while
> "optim" will at least produce an answer.  If I'm not mistaken, this
> issue is discussed in either McCullough, B. D. (1999) Assessing the
> reliability of statistical software: Part II The American
> Statistician, 53, 149-159 or McCullough, B. D. (1998) Assessing the
> reliability of statistical software: Part I The American Statistician,
> 52, 358-366.  I don't remember now which paper had this, but I believe
> one of them did;  I think I'd look at the second first.  (McCullough
> discussed "nlminb" instead of "optim".  The former has been replaced
> by the latter in R.)
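A sketch of the workflow described above: use optim to minimize the
residual sum of squares from a rough start, then feed its answer to
nls as starting values.  The model and data here are hypothetical.

```r
# Simulated data from an exponential-decay model.
set.seed(2)
x <- 1:25
y <- 3 * exp(-0.15 * x) + rnorm(25, sd = 0.05)

# Step 1: minimise the residual sum of squares with optim()
# (Nelder-Mead by default) from a deliberately rough start.
rss   <- function(p) sum((y - p[1] * exp(-p[2] * x))^2)
rough <- optim(c(a = 1, b = 0.3), rss)$par

# Step 2: hand optim's answer to nls() as starting values.
fit <- nls(y ~ a * exp(-b * x), start = as.list(rough))
coef(fit)
```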

I would be hesitant to draw too many conclusions from McCullough's
experiences.  He used the data sets, models, and starting estimates
from the NIST nonlinear least squares test sets.  These are available
in the NISTnls package for R.  If you run the examples in that package
you will see that all of the examples can be fit reasonably easily in
R.  However, the people at NIST decided that they wanted to create
"easy" and "difficult" versions of each example and they did this by
choosing one reasonable set of starting estimates and one ridiculous set
of starting estimates for each problem.

Check, for example,

library(NISTnls)
example(Eckerle4)

The model that is being fit is a scaled version of a Gaussian
density.  A mere glance at the data shows that the location parameter
will be around 450 and the scale parameter will be around 5.  There is
no problem at all converging from those starting values - that's the
"easy" version of the problem.  The "difficult" version of the problem
is to try to converge from starting estimates of 500 for the location
and 10 for the scale.  
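To see the shape of the problem without the NISTnls data, here is a
sketch on simulated data with the same scaled-Gaussian form and
roughly the Eckerle4 peak location and width (the b1/b2/b3 names
follow the NIST parameterisation; the data are not the NIST data):

```r
# Hypothetical data shaped like Eckerle4: a scaled Gaussian density
# with its peak near x = 450 and width near 5.
set.seed(42)
x <- seq(400, 500, by = 2.5)
y <- (1.55 / 4.1) * exp(-0.5 * ((x - 451.5) / 4.1)^2) +
  rnorm(length(x), sd = 0.001)

# "Easy" start: location and scale read off a plot of the data.
fit <- nls(y ~ (b1 / b2) * exp(-0.5 * ((x - b3) / b2)^2),
           start = list(b1 = 1.5, b2 = 5, b3 = 450))
coef(fit)
# Starting instead from b3 = 500, b2 = 10 puts the model in a region
# where the fitted curve is essentially flat, and nls typically
# fails to converge from there.
```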

Look at a plot of the data.  Would anyone who knew what those
parameters represent consider 500 as a reasonable guess of the
location of the peak?  I don't feel that it is a terrible deficiency
in the nls implementation in R that it does not converge on this
model/data combination from those starting estimates.



