[R] Help Choosing Start Values for nls

Peter Dalgaard pdalgd at gmail.com
Sat Aug 21 20:58:51 CEST 2010


On 08/21/2010 08:08 PM, Nick Torenvliet wrote:
> Hi all,
> 
> I'm trying to do a simple curve fit and coming up with some interesting
> results I would like to get comment on.
> So as shown below, tsR is my explanatory and response is... well... my
> response.
> 
> This same data in gnumeric gets fitted with the curve "response=10078.4 +
> 1358.67 * ln (explanatory - 2009.07)".
> 
> So I'm using nls with the start values supplied by gnumeric.
> 
> In "First Time Through", where I start very close to the values gnumeric
> found, I get NaNs and Infinities.
> 
> In "Second Time Through" I use the "exact" values given by gnumeric... and
> it all comes together fine.
> 
> The difference between start value in First and Second time through is
> practically insignificant and gnumeric was able to fit this without effort.
> 
> At this rate R is basically useless to me -- what can I change here to set R
> up better for this non-linear fit?
> 
> What a shame... such a beautiful curve :-)

Hmm, nls is known to be somewhat finicky in the convergence department.
Sometimes it works better with the "port" algorithm, but in this case, I
think the culprit is pretty clear: Your "C" parameter is precariously
close to the smallest value of tsR, so the fitting algorithm will very
easily step to a C >= min(tsR), where log(tsR - C) is NaN or -Inf. You
could try helping the algorithm by changing the log term to
log(pmax(tsR-C,.001)).
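
A rough sketch of both ideas, using the data from the quoted post. I have
not checked that either fit converges from these start values, hence the
try() wrappers; C_bad, fit_guard and fit_port are just names made up for
the illustration, and the 0.001 margin is arbitrary.

```r
# Data copied from the quoted post.
response <- c(7062.93, 7608.92, 8168.12, 8500.33, 8447.00, 9171.61,
              9496.28, 9712.28, 9712.73, 10344.84, 10428.05, 10067.33,
              10325.26, 10856.63, 11008.61, 10136.63, 9774.02, 10465.94,
              10319.95)
tsR <- c(2009.167, 2009.250, 2009.333, 2009.417, 2009.500, 2009.583,
         2009.667, 2009.750, 2009.833, 2009.917, 2010.000, 2010.083,
         2010.167, 2010.250, 2010.333, 2010.417, 2010.500, 2010.583,
         2010.667)

# A hypothetical trial value of C just past the smallest tsR, of the kind
# an iteration step might propose.
C_bad <- 2009.2
unguarded <- suppressWarnings(log(tsR - C_bad))  # NaN for the first point
guarded   <- log(pmax(tsR - C_bad, 0.001))       # finite for every point

# Option 1: guard the log term so the model never evaluates to NaN.
fit_guard <- try(
  nls(response ~ A + B * log(pmax(tsR - C, 0.001)),
      start = c(A = 10000, B = 1350, C = 2000)),
  silent = TRUE)

# Option 2: the "port" algorithm with an upper bound that keeps C below
# min(tsR). Bounds are given in the order of the start values (A, B, C).
fit_port <- try(
  nls(response ~ A + B * log(tsR - C),
      start = c(A = 10000, B = 1350, C = 2000),
      algorithm = "port",
      upper = c(Inf, Inf, min(tsR) - 0.001)),
  silent = TRUE)
```

If either call fails, the object has class "try-error" and printing it
shows the message; otherwise summary() works as usual.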

(Some may also question whether you really believe that the function
should shoot off to minus infinity just to the left of your data...)

A little piece of advice: If you want people to actually try things with
the data, it would be helpful if you gave them using dput(); print()
output can be tricky to read back in.
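
For instance, something along these lines (a small sketch using only the
first three tsR values; tsR_head, txt and restored are names invented
here):

```r
# dput() writes an R expression that recreates the object when pasted
# back into a session, unlike the "[1] ..." layout that print() produces.
tsR_head <- c(2009.167, 2009.250, 2009.333)
dput(tsR_head)

# Round trip: capture the dput() text and evaluate it again.
txt <- paste(capture.output(dput(tsR_head)), collapse = "")
restored <- eval(parse(text = txt))
all.equal(restored, tsR_head)
```

A reader can then copy the dput() line straight from the mail into R.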

-pd

> 
> 
> **********************Data*******************************
>> response
>  [1]  7062.93  7608.92  8168.12  8500.33  8447.00  9171.61  9496.28  9712.28
>  [9]  9712.73 10344.84 10428.05 10067.33 10325.26 10856.63 11008.61 10136.63
> [17]  9774.02 10465.94 10319.95
> 
>> tsR
>  [1] 2009.167 2009.250 2009.333 2009.417 2009.500 2009.583 2009.667 2009.750
>  [9] 2009.833 2009.917 2010.000 2010.083 2010.167 2010.250 2010.333 2010.417
> [17] 2010.500 2010.583 2010.667
> 
> ***********************First Time Through*******************************
>> reFit <- nls(response ~ A + B * log(tsR - C),
> start=c(A=10000,B=1350,C=2000))
> Error in numericDeriv(form[[3L]], names(ind), env) :
>   Missing value or an infinity produced when evaluating the model
> In addition: Warning message:
> In log(tsR - C) : NaNs produced
> 
> ***********************Second Time Through*******************************
>> reFit <- nls(response ~ A + B *
> log(tsR-C),start=c(A=10078.4,B=1358.67,C=2009.07))
>> summary(reFit)
> 
> Formula: response ~ A + B * log(tsR - C)
> 
> Parameters:
>    Estimate Std. Error   t value Pr(>|t|)
> A 1.008e+04  1.427e+02    70.628  < 2e-16 ***
> B 1.359e+03  2.997e+02     4.534 0.000339 ***
> C 2.009e+03  7.314e-02 27468.499  < 2e-16 ***
> ---
> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> 
> Residual standard error: 388.8 on 16 degrees of freedom
> 
> Number of iterations to convergence: 6
> Achieved convergence tolerance: 6.394e-06
> 


-- 
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com


