[R] Automatization of non-linear regression

Douglas Bates bates at stat.wisc.edu
Thu Oct 22 23:43:24 CEST 2009


Alternatively you could install the NRAIA package and collapse the
construction of the plot to a call to plotfit as shown in the
enclosed.

Note that this is a poor fit of a logistic growth model.  There are no
data values beyond the inflection point so the Asym parameter (the
asymptote) is poorly determined and the asymptote and the inflection
point on the x axis (the xmid parameter) will be highly correlated.
Furthermore the increasing variability is due to differences between
plants.  You can check with

xyplot(severity ~ time, data1, groups = plant, type = c("g", "b"))

so a nonlinear mixed-effects model may be more appropriate.  See
Pinheiro and Bates (Springer, 2000).

2009/10/22 Peter Ehlers <ehlers at ucalgary.ca>:
> Joel,
>
> The following should answer most of your questions:
>
>  time <- c(seq(0,10),seq(0,10),seq(0,10))
>  plant <- c(rep(1,11),rep(2,11),rep(3,11))
>  severity <- c(
>        42,51,59,64,76,93,106,125,149,171,199,
>        40,49,58,72,84,103,122,138,162,187,209,
>        41,49,57,71,89,112,146,174,218,250,288)/288
>
> # Suggestion: you don't need cbind() to make a dataframe:
>
>  data1 <- data.frame(time, plant, severity)
>
> # Plot severity versus time
> # you can avoid the awkward ..$.. by using with():
>
>  with(data1, plot(time, severity, type="n"))
>  with(data1, text(time, severity, plant))
>  title(main="Graph of severity vs time")
>
> # Now you use either the 'getInitial' approach and
> # transform your parameter estimates to your
> # preferred ones, or you can use SSlogis for your
> # model and transform the parameter estimates
> # afterwards:
>
> # Method 1:
> # --------
>  ini <- getInitial(
>        severity ~ SSlogis(time, alpha, xmid, scale),
>        data = data1
>  )
>  ini <- unname(ini) ##to get rid of names
>
> # start vector in terms of alpha, beta, gamma:
>  para0.st <- c(
>         alpha = ini[1],
>         beta  = ini[2]/ini[3],
>         gamma = 1/ini[3]
>  )
>
>  fit0 <- nls(
>          severity~alpha/(1+exp(beta-gamma*time)),
>          data1,
>          start=para0.st,
>          trace=T
>  )
>
> # logistic curve on the scatter plot
>  co <- coef(fit0)  ##get the parameter estimates
>  curve(co[1]/(1+exp(co[2]-co[3]*x)), add=TRUE)
>
> # you don't need from, to, since the plot is already
> # set up and you don't want to restrict the curve;
>
> # personally, I prefer defining the function to be plotted:
>
>  f <- function(x, abc){
>    alpha <- abc[1]
>    beta  <- abc[2]
>    gamma <- abc[3]
>    alpha / (1 + exp(beta - gamma * x))
>  }
>
> # then you can plot the curve with:
>
>  curve(f(x, coef(fit0)), add = TRUE)
>
>
> # Method 2:
> # --------
>  fit2 <- nls(severity ~ SSlogis(time, Asym, xmid, scal), data = data1)
>  co <- unname(coef(fit2))
>  abc <- c(co[1], co[2]/co[3], 1/co[3])
>  with(data1, plot(time, severity, type="n"))
>  with(data1, text(time, severity, plant))
>  title(main="Graph of severity vs time")
>  curve(f(x, abc), add = TRUE)
>
>
> Cheers,
> -Peter Ehlers
>
> Joel Fürstenberg-Hägg wrote:
>>
>> Hi everybody,
>>
>>
>> I'm using the method described here to make a linear regression:
>>
>>
>>
>> http://www.apsnet.org/education/advancedplantpath/topics/Rmodules/Doc1/05_Nonlinear_regression.html
>>
>>
>>>
>>> ## Input the data that include the variables time, plant ID, and severity
>>> time <- c(seq(0,10),seq(0,10),seq(0,10))
>>> plant <- c(rep(1,11),rep(2,11),rep(3,11))
>>>
>>> ## Severity represents the number of
>>> ## lesions on the leaf surface, standardized
>>> ## as a proportion of the maximum
>>> severity <- c(
>>
>> +         42,51,59,64,76,93,106,125,149,171,199,
>> +         40,49,58,72,84,103,122,138,162,187,209,
>> +         41,49,57,71,89,112,146,174,218,250,288)/288
>>>
>>> data1 <- data.frame(
>>
>> +         cbind(
>> +                 time,
>> +                 plant,
>> +                 severity
>> +         )
>> + )
>>>
>>> ## Plot severity versus time ## to see the relationship between## the two
>>> variables for each plant
>>> plot(
>>
>> +         data1$time,
>> +         data1$severity,
>> +         xlab="Time",
>> +         ylab="Severity",
>> +         type="n"
>> + )
>>>
>>> text(
>>
>> +         data1$time,
>> +         data1$severity,
>> +         data1$plant
>> + )
>>>
>>> title(main="Graph of severity vs time")
>>>
>>> getInitial(
>>
>> +         severity ~ SSlogis(time, alpha, xmid, scale),
>> +         data = data1
>> + )
>>    alpha      xmid     scale  2.212468 12.506960  4.572391
>>>
>>> ## Using the initial parameters above,
>>> ## fit the data with a logistic curve.
>>> para0.st <- c(
>>
>> +         alpha=2.212,
>> +         beta=12.507/4.572, # beta in our model is xmid/scale
>> +         gamma=1/4.572 # gamma (or r) is 1/scale
>> + )
>>>
>>> fit0 <- nls(
>>
>> +         severity~alpha/(1+exp(beta-gamma*time)),
>> +         data1,
>> +         start=para0.st,
>> +         trace=T
>> + )
>> 0.1621433 :  2.2120000 2.7355643 0.2187227 0.1621427 :  2.2124095
>> 2.7352979 0.2187056
>>>
>>> ## Plot to see how the model fits the data; plot the
>>> ## logistic curve on a scatter plot
>>> plot(
>>
>> +         data1$time,
>> +         data1$severity,
>> +         type="n"
>> + )
>>>
>>> text(
>>
>> +         data1$time,
>> +         data1$severity,
>> +         data1$plant
>> + )
>>>
>>> title(main="Graph of severity vs time")
>>>
>>> curve(
>>
>> +         2.21/(1+exp(2.74-0.22*x)),
>> +         from=time[1],
>> +         to=time[11],
>> +         add=TRUE
>> + )
>>
>>
>> As you can see I have to do some work manually, such as setting the
>> numbers to be used for calculation of alpha, beta and gamma. I wonder if you
>> might have an idea how to automatize this? I suppose it should be possible
>> to save the output from getInitial() and reach the elements via index or
>> something, but how?
>>
>>
>> I guess a similar approach could be used for the values of fit0?
>>
>>
>> Or even better, if the variables alpha, beta and gamma could be used right
>> away for instance in curve(), instead of adding the values manually. But
>> just exchanging the values with the varables (alpha instead of 2.21 etc)
>> doesn't seem to work. What is the reason for that? Any solution?
>>
>>
>> A last, general but somewhat related question. If I set variables in a
>> function such as para0.st <- c(alpha=2.212, ...), is it just stored locally,
>> or can it be used globally, I mean, can I use the variable anywhere (for
>> instance in curve()) or just in the function where it was created? I'm
>> asking because I'm used to Java, where the life time of local variabels only
>> extends to the closing braces, while global variables can be reached
>> everywhere.
>>
>>
>> The reason for automatization is that I'll have to repeat the procedure
>> more than a hundred times, while making overview pair waise plots of my
>> data, with both this logaritmic regression and several others (exponential,
>> monomoelcular, logistic, Gompertz and Weibull).
>>
>>
>>
>> Wish you all the best,
>>
>>
>> Joel
>>
>>  _________________________________________________________________
>> Nya Windows 7 gör allt lite enklare. Hitta en dator som passar dig!
>> http://windows.microsoft.com/shop
>>        [[alternative HTML version deleted]]
>>
>>
>>
>> ------------------------------------------------------------------------
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
-------------- next part --------------
R version 2.9.2 (2009-08-24)
Copyright (C) 2009 The R Foundation for Statistical Computing
ISBN 3-900051-07-0

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

 - MKL libraries for acclerated math installed: see help (setMKLthreads)
 - ParallelR packages installed: see help (package='foreach')

Type 'revo()' to visit www.revolution-computing.com for the latest
REvolution R news, 'forum()' for the community forum, or 'readme()'
for release notes.

> library(NRAIA)
Loading required package: lattice
> data1 <- data.frame(time = rep(0:10, 3),
+                     plant = rep(1:3, each = 11),
+                     severity = c(42,51,59,64,76,93,106,125,149,171,199,
+                     40,49,58,72,84,103,122,138,162,187,209,
+                     41,49,57,71,89,112,146,174,218,250,288)/288)
> summary(fit2 <- nls(severity ~ SSlogis(time, Asym, xmid, scal), data1))

Formula: severity ~ SSlogis(time, Asym, xmid, scal)

Parameters:
     Estimate Std. Error t value Pr(>|t|)    
Asym    2.212      1.993   1.110 0.275813    
xmid   12.507      6.929   1.805 0.081100 .  
scal    4.572      1.189   3.845 0.000583 ***
---
Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1 

Residual standard error: 0.07352 on 30 degrees of freedom

Number of iterations to convergence: 0 
Achieved convergence tolerance: 5.513e-07 

> plotfit(fit2)
> 
> proc.time()
   user  system elapsed 
  0.960   0.070   1.007 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Rplots.pdf
Type: application/pdf
Size: 7371 bytes
Desc: not available
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20091022/285d4b50/attachment-0002.pdf>


More information about the R-help mailing list