[R] nlme: problem with fitting logistic function

Mon Mar 9 22:52:36 CET 2009

On Mon, Mar 9, 2009 at 12:13 PM, Dieter Menne
<dieter.menne at menne-biomed.de> wrote:
> jakub kreisinger <jakubkreisinger <at> seznam.cz> writes:
>
>> I am trying to analyze growth data on mice. To do this I attempted to fit
>> a logistic curve using the nlme package.
>> However, the dataset I use is large (in total ca. 20 000 measurements on
>> ca. 3 000 individuals) with a relatively complicated structure (several
>> explanatory variables with interactions + a random effect where
>> individual offspring are nested within litters, which are nested within
>> parental pairs).
>> Although I had no problems fitting models where the complexity of the
>> random effect was reduced (which is conceptually incorrect), the fitting
>> procedure for the full model did not reach the first iteration after
>> several hours. Do you have any idea how to solve this problem?
>
> 1) Get better starting values by using nlsList
> 2) Get still better starting values by fitting a simpler model first with
>   nlme. This is VERY successful for me, it made many problems feasible that
>   blew up otherwise. Good luck in finding the right syntax for complex
>   start values, this can be a huge challenge.
> 3) Use lmer in lme4. Your mileage may vary, I could not find a speedup
>   for my problems, but larger problems might give one.

Did you mean nlmer in the lme4 package?  If so, it may be worthwhile
trying the development branch but that is not something for the
faint-hearted.

> 4) Use C for the core function. This is very effective, and there is at least
>   one example coming with nlme (was it SSlogist?).

Do you think that evaluation of the model function takes a substantial
portion of the computing time?  I am asking out of interest, not
because I think I know the answer.  So, for example, have you profiled
difficult nlme fits and found that the model function evaluation was
expensive?

I did find a similar result when fitting generalized linear mixed
models - a substantial portion of the execution time was spent in the
evaluation of the inverse link function and its derivative - so I
moved that to compiled code.  Interestingly in those cases it wasn't
the evaluation of the function as much as checking the boundary
conditions that was taking up time.

I wouldn't recommend patterning such code after nlme_one_comp_first in
nlme.c.  I would use the .Call interface to C code instead.  The real
trick would be applying the recycling rule inside the C code without
tying yourself in knots.  A function like SSlogis (there is no "t" in
the name, by the way) takes an "input" argument and three parameters,
"Asym", "xmid" and "scal".   The "input" argument always has a length
which is the number of observations but "Asym", "xmid" and "scal" are
sometimes scalars and sometimes vectors of the same length as "input".

If you would be willing to run some tests, I will volunteer to write
the code and we can see if it helps.  (Not to say that you couldn't
write it yourself if you were so inclined but I do this a lot and
could probably get something working relatively quickly - why do
those sound like "famous last words"?)