[R] How to put factor variables in an nls formula ?

Sat Aug 20 01:04:12 CEST 2005

On 8/19/05, Douglas Bates <dmbates at gmail.com> wrote:
> On 8/19/05, François Morneau <francois.morneau at cirad.fr> wrote:
> > Le 11:12 19/08/2005,Douglas Bates écrit:
> > >On 8/18/05, François Morneau <francois.morneau at cirad.fr> wrote:
> > > > Hello,
> > > >
> > > > I want to fit a Gompertz model for tree diameter growth that depends on a 4
> > > > levels edaphic factor ('Drain') and I don't manage to introduce the factor
> > > > variable in the formula.
> > > > Dinc is the annual diameter increment and D is the Diameter.
> > > >
> > > >  >treestab
> > > >  >     Dinc     D      Drain
> > > >   [1,]  0.03  26.10     2
> > > >   [2,]  0.04  13.05     1
> > > >   [3,]  0.00  24.83     1
> > > >   [4,]  0.00  15.92     4
> > > >   [5,]  0.00  12.25     4
> > > >   [6,]  0.00  11.78     4
> > > >   [7,]  0.00  16.87     4
> > > >   [8,]  0.00  15.12     4
> > > >   [9,] -0.01  13.53     4
> > > > [10,] 0.04  16.55     3
> > > > [11,] 0.025 16.07     3
> > > > [12,] 0.00  30.24     3
> > > > [13,] 0.06  15.28     2
> > > > etc…
> > > >  >contrasts(Drain)<-contr.sum(4)
> > > >  >mymodel<-nls(Dinc~r*(1+Drain)*D*log(Asym/D), data=treestab,
> > > start=list(r=0.05,
> > > > Asym=40))
> > > >
> > > > Error in numericDeriv(form[[3]], names(ind), env) :
> > > >          Missing value or an infinity produced when evaluating the model
> > > > In addition: Warning messages:
> > > > 1: + not meaningful for factors in: Ops.factor(1, Drain)
> > > > 2: + not meaningful for factors in: Ops.factor(1, Drain)
> > >
> > >It is not clear to me what you are trying to do.  Can you give us a
> > >bit more information such as how many parameter estimates you expect
> > >to obtain and what they would represent?
> >
> > I'm trying to estimate the effect of the edaphic factor 'Drain' on tree
> > diameter growth. The relationship between diameter 'D' and diameter
> > increment 'Dinc' follow a Gompertz model written : Dinc= r D log(Asym/D)
> > where 'r' is a parameter related to growth speed and 'Asym' is the diameter
> > max where growth stops.
> >
> > My hypothesis is that 'Asym' doesn't change according to edaphic condition
> > but 'r' does. So I'd like to estimate 'Asym' and four 'r'i parameters for
> > each level i of 'Drain' with 'r'i= r0 + effect of 'Drain'i
> >
> > I hope this is helpful...
> 
> Yes, it is.

Sorry for the non-informative contribution - I hit the "Send" button
when I meant to hit "Cancel".

This is less easy than I thought it would be.  The Puromycin data
provides a similar example in that there are two sets of observations
of concentrations and rates, for treated and untreated cells.  The
experimenters assumption is that the rate should be related to the
concentration via the Michaelis-Menten relationship that depends on
two parameters, Asym and K, and that K should not change with
treatment but Asym will.

To fit that we can use
> nls(rate ~ (Asym + delta*(state == 'treated'))*conc/(K + conc), Puromycin, start = c(Asym = 180, delta = 30, K = 0.05))
Nonlinear regression model
  model:  rate ~ (Asym + delta * (state == "treated")) * conc/(K + conc) 
   data:  Puromycin 
        Asym        delta            K 
166.60407167  42.02596094   0.05797178 
 residual sum-of-squares:  2240.891 

To generalize this it is convenient to form the model matrix with the
indicator columns of the state factor.

> mm <- model.matrix(~ 0 + state, Puromycin)

then use matrix multiplication within a call to drop().

> nls(rate ~ drop(mm %*% c(a1,a2)) * conc/(K + conc), Puromycin, start = c(a1 = 210, a2 = 165, K = 0.05), trace = TRUE)
2822.528 :  210.00 165.00   0.05 
2247.841 :  207.77558541 165.94256936   0.05647425 
2240.997 :  208.49340568 166.51190757   0.05778192 
2240.893 :  208.61339576 166.59293688   0.05794926 
2240.891 :  208.62809965 166.60277853   0.05796917 
2240.891 :  208.62983812 166.60394147   0.05797152 
2240.891 :  208.6300430 166.6040785   0.0579718 
Nonlinear regression model
  model:  rate ~ drop(mm %*% c(a1, a2)) * conc/(K + conc) 
   data:  Puromycin 
         a1          a2           K 
208.6300430 166.6040785   0.0579718 
 residual sum-of-squares:  2240.891 

You can do the same thing defining the model matrix of indicators of
Drain and a vector of length 4 for D.  (By the way, I would avoid the
name 'D' for a variable as that is the name of an in-built function in
R.)

The reason that I said this is more difficult than I thought it would
be is because I thought that the parameters in nls could be vectors
but I haven't been able to get that to work.

An alternative is to use the gnls function from the nlme package which
does allow a parameter to be specified according to the levels of a
factor.

> >
> > > >
> > > > Do I need to use another function instead of nls to correctly include the
> > > > factor 'Drain' ?
> > > >
> > > > Thanks,
> > > >
> > > > François
> >
> > François Morneau
> > UMR Écologie des Forêts de Guyane
> > Kourou, French Guiana
> >
> >
>