[R] What is the degrees of freedom in an nlme model

Tue Jul 13 00:56:46 CEST 2010

Hi, Bert,

Thanks for your thoughtful explanation. I think the problem is quite
over my head and maybe I should leave "how" for experts :)

The situation is I have a group of sigmoid curves (let's say, they are
supposed to be the same) but occasionally you will see a few curves
kind of different. So how do we say they are actually different or not
from the majority curves in a statistical way? The original idea was
proposed by Monson and Rodbard in 1978 (Am J Physiol. 1978
Aug;235(2):E97-102). The paper is freely available. The idea is to fit
the curves individually and obtain the residual sum of squares and
then fit the curves altogether somehow constraining some parameters
and then you have another residual sum of squares. Then you can do a
F-test. In my case, I wonder if I can use a mixed-effect modeling to
do the simultaneous fitting job. Now you see, the problem is the
degrees of freedom. Based on your explanation, it seems no reliable
calculation of df for nonlinear models. However I can still see the df
reported in nlme or nls models. Now I am not even sure if I should use
them.

Another thing I observed is even I added more random effects to the
nlme model, the denominator df did not seem to change. Is it correct?
Thanks again.

Jun

On Mon, Jul 12, 2010 at 4:00 PM, Bert Gunter <gunter.berton at gene.com> wrote:
> Jun:
>
> Short answer: There is no such thing as df for a nonlinear model (whether or
> not mixed effects).
>
> Longer answer: df is the dimension of the null space when the data are
> projected on the linear subspace of the model matrix of a **linear model **
> . So, strictly speaking, no linear model, no df.
>
> HOWEVER... nonlinear models are usually (always??) fit by successive linear
> approximations, and approximate df are obtained from these approximating
> subspaces.
>
> However, the problem with this is that there is no guarantee that the
> relevant residual distributions are sufficiently chisq with the approximate
> df to give reasonable answers. In fact, lots of people much smarter than I
> have spent lots of time trying to figure out what sorts of approximations
> one should use to get trustworthy results. The thing is, in nonlinear
> models, it can DEPEND on the exact form of the model -- indeed, that's what
> distinguishes nonlinear models from linear ones! So this turns out to be
> really hard and afaik these smart people don't agree on what should be done.
>
>
> To see what one of the smartest people have to say about this, search the
> archives for Doug Bates's comments on this w.r.t. lmer (he won't compute
> such distributions nor provide P values because he doesn't know how to do it
> reliably. Doug -- please correct me if I have it wrong).
>
> A stock way to extricate oneself from this dilemma is: bootstrap!
> Unfortunately, this is also probably too facile: for one thing, with a
> nondiagonal covariance matrix (as in mixed effects models), how do you
> resample to preserve the covariance structure? I believe this is an area of
> active research in the time series literature, for example. For another,
> this may be too computationally demanding to be practicable due to
> convergence issues.
>
> Bottom line: there may be no good way to do what you want.
>
> Note to experts: Please view this post as an invitation to correct my errors
> and provide authoritative info.
>
> Cheers to all,
>
> Bert
>
> Bert Gunter
> Genentech Nonclinical Biostatistics
>
>
>
> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On
> Behalf Of Jun Shen
> Sent: Monday, July 12, 2010 12:34 PM
> To: R-help
> Subject: [R] What is the degrees of freedom in an nlme model
>
> Dear all,
>
> I want to do a F test, which involves calculation of the degrees of
> freedom for the residuals. Now say, I have a nlme object "mod.nlme". I
> have two questions
>
> 1.How do I extract the degrees of freedom?
> 2.How is this degrees of freedom calculated in an nlme model?
>
> Thanks.
>
> Jun Shen
>
> Some sample code and data
> =================================================================
> mod.nlme<-nlme(RESP~E0+(Emax-E0)*CP**gamma/(EC50**gamma+CP**gamma),data=Data
> ,
> fixed=E0+Emax+gamma+EC50~1,
> random=list(pdDiag(EC50+E0+gamma~1)),
> groups=~ID,
> start=list(fixed=c(E0=1,Emax=100,gamma=1,b=50))
> )
>
> The "Data" object
> structure(list(ID = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
> 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
> 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
> 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
> 4L, 4L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
> 5L, 5L, 5L, 5L, 5L, 5L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L,
> 6L, 6L, 6L, 6L, 6L, 6L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L,
> 7L, 7L, 7L, 7L, 7L, 7L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L,
> 8L, 8L, 8L), CP = c(1, 2, 3, 4.5, 5, 7.5, 11.25, 12, 18, 30,
> 45, 60, 90, 120, 150, 200, 1, 2, 3, 4.5, 5, 7.5, 11.25, 12, 18,
> 30, 45, 60, 90, 120, 150, 200, 1, 2, 3, 4.5, 5, 7.5, 11.25, 12,
> 18, 30, 45, 60, 90, 120, 150, 200, 1, 2, 3, 4.5, 5, 7.5, 11.25,
> 12, 18, 30, 45, 60, 90, 120, 150, 200, 1, 2, 3, 4.5, 5, 7.5,
> 11.25, 12, 18, 30, 45, 60, 90, 120, 150, 200, 1, 2, 3, 4.5, 5,
> 7.5, 11.25, 12, 18, 30, 45, 60, 90, 120, 150, 200, 1, 2, 3, 4.5,
> 5, 7.5, 11.25, 12, 18, 30, 45, 60, 90, 120, 150, 200, 1, 2, 3,
> 4.5, 5, 7.5, 11.25, 12, 18, 30, 45, 60, 90), RESP = c(3.19, 2.52,
> 2.89, 3.28, 3.82, 7.15, 11.2, 16.25, 30.32, 55.25, 73.56, 82.07,
> 89.08, 95.86, 97.97, 99.03, 3.49, 4.4, 3.54, 4.99, 3.81, 10.12,
> 21.59, 24.93, 40.18, 61.01, 78.65, 88.81, 93.1, 94.61, 98.83,
> 97.86, 0.42, 0, 2.58, 5.67, 3.64, 8.01, 12.75, 13.27, 24.65,
> 46.1, 65.16, 77.74, 87.99, 94.4, 96.05, 100.4, 2.43, 0, 6.32,
> 5.59, 8.48, 12.32, 26.4, 28.36, 43.38, 69.56, 82.53, 91.36, 95.37,
> 98.36, 98.66, 98.8, 5.16, 2, 5.65, 3.48, 5.78, 5.5, 11.55, 8.53,
> 18.02, 38.11, 58.93, 70.93, 85.62, 89.53, 96.19, 96.19, 2.76,
> 2.99, 3.75, 3.02, 5.44, 3.08, 8.31, 10.85, 13.79, 32.06, 50.22,
> 63.7, 81.34, 89.59, 93.06, 92.47, 3.32, 1.14, 2.43, 2.75, 3.02,
> 5.4, 8.49, 7.91, 15.17, 35.01, 53.91, 68.51, 83.12, 86.85, 92.17,
> 95.72, 3.58, 0.02, 3.69, 4.34, 6.32, 5.15, 9.7, 11.39, 23.38,
> 42.9, 61.91, 71.82, 87.83)), .Names = c("ID", "CP", "RESP"), class =
> "data.frame", row.names = c(NA,
> -125L))
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>