[R] Non-linear Regression best-fit line

peter dalgaard pdalgd at gmail.com
Sat Jun 18 11:05:53 CEST 2011


On Jun 17, 2011, at 23:14 , Sean Bignami wrote:

> I am trying to fit a curve to a cumulative mortality curve (logistic) where y is the cumulative proportion of mortalities, and t is the time in hours (see below). Asym. at 0 and 1
>> y
> [1] 0.00000000 0.04853859 0.08303777 0.15201970 0.40995074 0.46444992 0.62862069 0.95885057 1.00000000
> [10] 1.00000000 1.00000000
>> t
> [1]  0 13 20 24 37 42 48 61 72 86 90
> 
> I tried to find starting values for a Gompertz non-linear regression by converting the equation (y~1*exp(-beta*exp(-gamma*t)) to a linear form per Dalgaard "Introductory Statistics with R" pg.279-280. But got this Error message:
> 
>> lm(log(0-log(y))~t)
> Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) : 
>  NA/NaN/Inf in foreign function call (arg 4)
> 
> I tried to change all by 0 and 1 values to non-zero and non-one values (yy and tt below), and was able to get starting estimates.
> 
>> yy<-c(0.000000001,0.04853859, 0.08303777, 0.15201970, 0.40995074, 0.46444992, 0.62862069, 0.95885057, 0.9999999999,0.999999999999,0.999999999999)
>> tt<-c(0.0000000001,13,20,24,37,42,48,61,72,86,90)
> 
>> lm(log(0-log(yy))~tt)
> 
> Call:
> lm(formula = log(0 - log(yy)) ~ tt)
> 
> Coefficients:
> (Intercept)           tt  
>     9.5029      -0.3681 
> 
> However, when I plug those values into the nls function, I get an error message about the "getInitial" method
> 
>> nlsout<-nls(y~1*exp(-beta*exp(-gamma*t),start=c(beta=exp(9.5),gamma=.368)));summary(nlsout)
> Error in getInitial.default(func, data, mCall = as.list(match.call(func,  : 
>  no 'getInitial' method found for "function" objects

Not really, but getting your parentheses right on the nls call should at least get you closer. There are 3 "(" before "start" but only 1 ")", so the "start" bit is part of the formula, etc.

Trying that, I got

> nlsout<-nls(yy~1*exp(-beta*exp(-gamma*tt)),start=list(beta=exp(9.5),gamma=.368))
Error in numericDeriv(form[[3L]], names(ind), env) : 
  Missing value or an infinity produced when evaluating the model

so you still have some way to go. (I've been using the yy/tt variables because they were easier to paste in from your post. Shouldn't matter much.)

Offhand, I'd say that your procedure for finding starting values is still too sensitive to the zeros and ones. If you do 

plot(tt, log(-log(yy)) )

you will see that the last three points (the 1s in your original data) fall way off the linear trend. So how about omitting them rather than putting in an arbitrary value? And get rid of the zero too.

> lm(log(-log(yy))~tt,subset=2:8)

Call:
lm(formula = log(-log(yy)) ~ tt, subset = 2:8)

Coefficients:
(Intercept)           tt  
     2.5870      -0.0807  


> nlsout<-nls(yy~1*exp(-beta*exp(-gamma*tt)),start=list(beta=exp(2.5),gamma=.08))
> summary(nlsout)
Formula: yy ~ 1 * exp(-beta * exp(-gamma * tt))

Parameters:
      Estimate Std. Error t value Pr(>|t|)    
beta  13.14634    4.61767    2.85    0.019 *  
gamma  0.07284    0.00869    8.38  1.5e-05 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

Residual standard error: 0.0568 on 9 degrees of freedom

Number of iterations to convergence: 10 
Achieved convergence tolerance: 4.29e-06 



-- 
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com



More information about the R-help mailing list