[R] lmer() vs. lme() gave different variance component estimates

Wed Sep 22 00:20:01 CEST 2010

On Tue, Sep 21, 2010 at 4:21 PM, Peter Dalgaard <pdalgd at gmail.com> wrote:
> On 09/21/2010 09:02 PM, Douglas Bates wrote:
>> On Tue, Sep 21, 2010 at 1:39 PM, Douglas Bates <bates at stat.wisc.edu> wrote:
>>> I haven't had the time to keep up with this discussion, or many of the
>>> other discussions on the R-SIG-Mixed-Models email list.  I swamped
>>> with other duties at present.
>
> It's not like I don't know how you feel....
>
>
>>>
>>> It is important to remember that the nlme and lme4 packages take a
>>> model specification and provide code to evaluate the deviance.  Other
>>> code is used to optimize the deviance.  In the case of nlme it is the
>>> nlminb function and in the case of lme4 it is the minqa function.
>>
>> Shouldn't send email when I'm tired.  The optimizer function is bobyqa
>> in the minqa package. (Sort of makes you nostalgic for the punched
>> cards and line printers when you see the Fortran names jammed into six
>> characters.)
>>
>>> There are no guarantees for numerical optimization routines.  If the
>>> problem is ill-conditioned then they can converge to "optima" that are
>>> not the global optimum.
>
> Right. However, what we have here is a case where I'm pretty damn sure
> that the likelihood function is unimodal (it's a linear
> reparametrization of three independent chi-square terms) and has an
> optimum in the interior of the feasible region.
>
> In any case, I'm thoroughly confused about the lme4/lme4a/lme4b
> subtrees. Which algorithm is used by the current CRAN lme4?? (I have

lme4b was a placeholder.  It should be ignored now.

lme4a is the development version that will become the released version
once Martin and I are convinced that it does all the things that lme4
does.  Before that can happen I have to get my head above water for a
week or two to catch up and it could be well into October before that
happens (I have a particularly heavy teaching load this semester).

lme4a is cleaner than lme4 but leans heavily on the Rcpp package to
achieve that.  So it uses classes and methods (although in different
senses) at both the R and the compiled code levels.  lme4a allows for
profiling the deviance in a LMM with respect to the parameters.

I had thought that the released version of lme4 (the one shown below)
allowed for selection of the optimizer but, "upon further review" (a
phrase known to fans of what Americans call "football") it turns out
that it doesn't.  So I misspoke.  The released version of lme4 uses
nlminb, as shown by the form of the verbose output.

The lme4a version of lmer uses bobyqa.  It would probably be
worthwhile running this test case in that version.

>> sessionInfo()
> R version 2.11.1 (2010-05-31)
> i386-redhat-linux-gnu
>
> locale:
>  [1] LC_CTYPE=en_US.utf8       LC_NUMERIC=C
>  [3] LC_TIME=en_US.utf8        LC_COLLATE=en_US.utf8
>  [5] LC_MONETARY=C             LC_MESSAGES=en_US.utf8
>  [7] LC_PAPER=en_US.utf8       LC_NAME=C
>  [9] LC_ADDRESS=C              LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.utf8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> other attached packages:
> [1] lme4_0.999375-35   Matrix_0.999375-39 lattice_0.18-8
>
> loaded via a namespace (and not attached):
> [1] grid_2.11.1   nlme_3.1-96   stats4_2.11.1 tcltk_2.11.1  tools_2.
>
> )
>
>
>
>>>
>>> Whoever suggested using the verbose option to see the progress of the
>>> iterations has the right idea.
>>>
>>>
>>> On Mon, Sep 20, 2010 at 2:50 PM, array chip <arrayprofile at yahoo.com> wrote:
>>>> Thank you Peter and Ben for your comments.
>>>>
>>>> John
>>>>
>>>>
>>>> ----- Original Message ----
>>>> From: Peter Dalgaard <pdalgd at gmail.com>
>>>> To: array chip <arrayprofile at yahoo.com>
>>>> Cc: r-help at r-project.org; r-sig-mixed-models at r-project.org
>>>> Sent: Mon, September 20, 2010 12:28:43 PM
>>>> Subject: Re: [R] lmer() vs. lme() gave different variance component estimates
>>>>
>>>> On 09/20/2010 08:09 PM, array chip wrote:
>>>>> Thank you Peter for your explanation of relationship between aov and lme. It
>>>>> makes perfect sense.
>>>>>
>>>>>
>>>>> When you said "you might have computed the average of all 8
>>>>> measurements on each animal and computed a 1-way ANOVA" for treatment effect,
>>>>> would this be the case for balanced design, or it is also true for unbalanced
>>>>> data?
>>>>
>>>> It is only exactly true for a balanced design, although it can be a
>>>> practical expedient in nearly-balanced cases, especially if there is a
>>>> clearly dominant animal variation. In strongly unbalanced data, you get
>>>> reduced efficiency because animals with less data should be downweighted
>>>> (not proportionally if there is substantial between-animal variation,
>>>> though). And of course the whole thing relies on the fact that you have
>>>> individuals nested in treatment (no animals had multiple treatments)
>>>>
>>>>>
>>>>> Another question is if 1-way ANOVA is equivalent to mixed model for testing
>>>>> treatment effect, what would be reason why mixed model is used? Just to
>>>>> estimate
>>>>>
>>>>> the variance components? If the interest is not in the estimation of variance
>>>>> components, then there is no need to run mixed models to test treatment
>>>>> effects?
>>>>
>>>> Not too far off the mark. In more complex cases, there is the advantage
>>>> that the mixed model helps figure out a sensible analysis for you.
>>>>
>>>>
>>>>> And my last question is I am glad to find that glht() from multcomp package
>>>>> works well with a lmer() fit for multiple comparisons. Given Professor Bates's
>>>>
>>>>> view that denominator degree's of freedom is not well defined in mixed models,
>>>>
>>>>> are the results from glht() reasonable/meaningful? If not, will the suggested
>>>>> 1-way ANOVA used together with glht() give us correct post-hoc multiple
>>>>> comparsion results?
>>>>
>>>> I think Doug's view is that DFs are not _reliably_estimated_ with any of
>>>> the current procedures. In the balanced cases, they are very well
>>>> defined (well, give or take the issues with "negative variances"), and I
>>>> would expect glht() to give meaningful results. Do check the residuals
>>>> for at least approximate normality, though.
>>>>
>>>>
>>>>>
>>>>> Thank you very much!
>>>>>
>>>>> John
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> ----- Original Message ----
>>>>> From: Peter Dalgaard <pdalgd at gmail.com>
>>>>> To: array chip <arrayprofile at yahoo.com>
>>>>> Cc: r-help at r-project.org; r-sig-mixed-models at r-project.org
>>>>> Sent: Sat, September 18, 2010 1:35:45 AM
>>>>> Subject: Re: [R] lmer() vs. lme() gave different variance component estimates
>>>>>
>>>>>
>>>>> For a nested design, the relation is quite straightforward: The residual
>>>>> MS are the variances of sample means scaled to be comparable with the
>>>>> residuals (so that in the absense of random components, all
>>>>> MS are equal to within the F-ratio variability). So to get the id:eye
>>>>> variance component, subtract the Within MS from the id:eye MS and divide
>>>>> by the number of replicates (4 in this case since you have 640
>>>>> observations on 160 eyes) (14.4 - 0.01875)/4 = 3.59, and similarly, the
>>>>> id variance is the MS for id minus that for id:eye scaled by 8:
>>>>> (42.482-14.4)/8 = 3.51.
>>>>>
>>>>> I.e. it is reproducing the lmer results above, but of course not those
>>>>> from your original post.
>>>>>
>>>>> (Notice, by the way, that if you are only interested in the treatment
>>>>> effect, you might as well have computed the average of all 8
>>>>> measurements on each animal and computed a 1-way ANOVA).
>>>>>
>>>>
>>>>
>>>> --
>>>> Peter Dalgaard
>>>> Center for Statistics, Copenhagen Business School
>>>> Phone: (+45)38153501
>>>> Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com
>>>>
>>>> ______________________________________________
>>>> R-help at r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>
>>>
>
>
> --
> Peter Dalgaard
> Center for Statistics, Copenhagen Business School
> Phone: (+45)38153501
> Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com
>