[R] NLS

Douglas Bates bates at stat.wisc.edu
Fri Nov 10 04:24:24 CET 2000

```
Zsombor Cseres-Gergely <z.cseres-gergely at ucl.ac.uk> writes:

> I try to do a very simple nonlinear regression. The function is
>
> y = (b0 + b1*x1 + b2*x2 + b3*x3) * x4^b4

Are you taking advantage of the fact that four of your five parameters
are conditionally linear?  You can use
algorithm = "plinear"
to indicate to the nls function that your model is partially linear
like this one is.  When this option is used you only need to specify a
starting estimate for b4 and the optimization is reduced to a
one-dimensional optimization of the profiled residual sum-of-squares.

You would write the model as

nls(y ~ x4^b4*cbind(1, x1, x2, x3), data = mydata, start = c(b4 = 0),
    algorithm = "plinear", trace = TRUE)

> I think I do everything well, but as I set the starting value of b4 to 0 (it
> is the theoretically sane starting value),

Do you really expect 0 to be a sensible value for this parameter?  If
so, have you already fit the linear regression model
y ~ 1 + x1 + x2 + x3
and found it to be adequate?  Why then do you think that x4 determines
the response in this fashion if your best guess at the value of b4 is
the value that makes x4 of no consequence?

Do you actually know so little about these data that you can't tell if
you expect b4 to be negative or to be positive?

One does not choose starting estimates in a nonlinear regression
because they are theoretically possible values.  One uses every
possible trick to come up with values that are consistent with the
observed data.

> it converges very quickly, and to the wrong solution.

Please explain this further.  An independent evaluation of the
nonlinear least squares algorithms in several major statistical and
econometrics packages by Bruce McCullough found that the algorithm and
convergence criterion used in S-PLUS (and in R) was one of two that
did *not* declare convergence to incorrect values (in the sense that
one or more of the "converged" parameter estimates had zero correct
significant digits) on at least one test problem.

> Wrong in a sense, that 1) we do not expect this and 2) we
> do not get this on E-Views, Stata and SAS. I do not use any extra setting,
> just the plain default. I did several regressions choosing starting values for
> b4 on the seq(-1,1,.01) series. It did find the correct values (with
> `globally' smallest RSS), but the result is strongly dependent on the initial
> values. Moreover, the good result comes from a `bad' initial value!  I have
> read that the nonlinear optimizer/minimizer will change in the future, but

Not that I am aware of.  However, R is an open source system and you
are welcome to contribute a superior nonlinear least squares
implementation at any time.

> this is funny. And it happens when I use R-devel, anyway.  Anyone