[R] nls: different results if applied to normal or linearized data

Prof Brian Ripley ripley at stats.ox.ac.uk
Thu Mar 6 07:03:34 CET 2008


The only thing you are adding to earlier replies is incorrect:

 	fitting by least squares does not imply a normal distribution.

For a regression model, least squares is in various senses optimal when
the errors are i.i.d. and normal, but it is a reasonable procedure in
many other situations (though not for even modestly long-tailed
distributions, which is the point of robust statistics).
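
A small simulation can illustrate this. The sketch below assumes uniform (clearly non-normal, short-tailed) errors and arbitrary illustrative values for the sample size and the true coefficients; none of these come from the thread.

```r
# Least squares with non-normal errors: an illustrative sketch.
# All numbers here are assumptions chosen for the example.
set.seed(42)
n   <- 2000
x   <- runif(n, 0, 10)
eps <- runif(n, -1, 1)        # uniform errors, not normal
y   <- 2 + 3 * x + eps        # assumed true intercept 2, slope 3

fit <- lm(y ~ x)              # ordinary least squares
print(coef(fit))              # estimates should be close to 2 and 3
```

Replacing `eps` with draws from a long-tailed distribution (for example `rt(n, df = 2)`) is one way to see how the estimates become much less stable.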

Although values from -Inf to +Inf are theoretically possible for a normal
distribution, it has very little mass in the tails and is often used as a
model for non-negative quantities (the justification of Box-Cox estimation,
for example, relies on this).
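
On the original question: fit (a) minimises the sum of squared residuals of y, whereas (b) and (c) minimise the sum of squared residuals of log10(y). These are different criteria, so different estimates are to be expected. A minimal sketch on simulated data follows; the data, the "true" values of a and b, and the multiplicative error model are assumptions for illustration, not the poster's dataset.

```r
# Compare the three fits from the thread on simulated data.
# The data-generating values below are assumptions for illustration.
set.seed(1)
x <- seq(1, 50, length.out = 40)
y <- 100 * x^0.95 * exp(rnorm(length(x), sd = 0.2))  # multiplicative errors

# (c) linear model on the log10 scale
fit_c <- lm(log10(y) ~ log10(x))

# (b) the same model written for nls(); minimises the same criterion as (c)
fit_b <- nls(log10(y) ~ log10(a) + b * log10(x),
             start = list(a = 50, b = 1))

# (a) nls() on the original scale; starting values taken from the log fit
fit_a <- nls(y ~ a * x^b,
             start = list(a = 10^coef(fit_c)[[1]], b = coef(fit_c)[[2]]))

est <- rbind(
  a = c(coef(fit_a)[["a"]], coef(fit_b)[["a"]], 10^coef(fit_c)[[1]]),
  b = c(coef(fit_a)[["b"]], coef(fit_b)[["b"]], coef(fit_c)[[2]])
)
colnames(est) <- c("(a)", "(b)", "(c)")
print(est)
```

Columns (b) and (c) should agree up to the convergence tolerance of nls(), while (a) will generally differ. Which fit is preferable depends on whether the errors look more nearly additive on the original scale or on the log scale, which residual plots from each fit can help to judge.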

On Wed, 5 Mar 2008, Martin Elff wrote:

> On Wednesday 05 March 2008 (14:53:27), Wolfgang Waser wrote:
>> Dear all,
>>
>> I did a non-linear least square model fit
>>
>> y ~ a * x^b
>>
>> (a) > nls(y ~ a * x^b, start=list(a=1,b=1))
>>
>> to obtain the coefficients a & b.
>>
>> I did the same with the linearized formula, including a linear model
>>
>> log(y) ~ log(a) + b * log(x)
>>
>> (b) > nls(log10(y) ~ log10(a) + b*log10(x), start=list(a=1,b=1))
>> (c) > lm(log10(y) ~ log10(x))
>>
>> I expected coefficient b to be identical for all three cases. However, using
>> my dataset, coefficient b was:
>> (a) 0.912
>> (b) 0.9794
>> (c) 0.9794
>>
>> Coefficient a also varied between option (a) and (b), 107.2 and 94.7,
>> respectively.
>
> Models (a) and (b) entail different distributions of the dependent variable y
> and different ranges of values that y may take.
> (a) implies that y has, conditionally on x, a normal distribution and
> has a range of feasible values from -Inf to +Inf.
> (b) and (c) imply that log(y) has a normal distribution, that is,
> y has a log-normal distribution and can take values from zero to +Inf.
>
>> Is this supposed to happen?
> Given the above considerations, different results with respect to the
> intercept are definitely to be expected.
>
>> Which is the correct coefficient b?
> That depends - is y strictly non-negative or not ...
>
> Just my 20 cents...
>

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


