[R] Using anova(f1, f2) to compare lmer models yields seemingly erroneous Chisq = 0, p = 1

Mon Sep 7 12:03:36 CEST 2009

On Mon, 2009-09-07 at 10:23 +0200, Viechtbauer Wolfgang (STAT) wrote:
> Are the two models (f1 and f2) actually nested?
> 
> Aside from that, it is strange that the output is exactly the same
> after you used REML=FALSE. The log likelihoods should have changed.

I might be completely misremembering, but I recall a thread in
R-SIG-Mixed where this was discussed and it was pointed out that
anova(...) on "mer" objects extracts the ML information even if fitted
using REML = TRUE.

The log likelihoods of the models supplied to 'anova' are being
extracted using REML = FALSE. So, if the above is correct, it does not
surprise me that there is no difference. 'anova' was doing the right
thing in both cases.

See ?"mer-class" for more details, then try:

logLik(f1, REML = FALSE)
logLik(f1, REML = TRUE)
logLik(f2, REML = FALSE)
logLik(f2, REML = TRUE)

'anova' is calling logLik with REML = FALSE regardless of what you
define in your model fitting call.

HTH

G

> 
> Best,
> 
> --
> Wolfgang Viechtbauer
>  Department of Methodology and Statistics
>  School for Public Health and Primary Care
>  University of Maastricht, The Netherlands
>  http://www.wvbauer.com/
> 
> 
> 
> 
> ----Original Message----
> From: r-help-bounces at r-project.org
> [mailto:r-help-bounces at r-project.org] On Behalf Of Matt Killingsworth
> Sent: Friday, September 04, 2009 22:29 To: Bert Gunter
> Cc: r-help at r-project.org
> Subject: Re: [R] Using anova(f1, f2) to compare lmer models yields
> seemingly erroneous Chisq = 0, p = 1
> 
> > Hi Bert,
> >
> > Thank you for your note!  I tried changing the REML default, and it
> > still produces the same result (see below).  Is that what you meant
> > for me to try?
> >
> > Incidentally, I am using lmer() not lme()
> >
> >  ### ORIGINAL ###
> >> f1 <- (lmer(outcome ~ predictor.1 + (1 | person), data=i))
> >> f2 <- (lmer(outcome ~ predictor.2 + (1 | person), data=i)) anova(f1,
> >> f2)
> >
> > Data: i
> > Models:
> > f1: outcome ~ predictor.1 + (1 | person)
> > f2: outcome ~ predictor.2 + (1 | person)
> >     Df    AIC     BIC    logLik Chisq Chi Df Pr(>Chisq)
> > f1  6  45443  45489 -22715
> > f2 25  47317  47511 -23633     0     19          1
> >
> > ### DO NOT USE REML ###
> >> f1 <- (lmer(outcome ~ predictor.1 + (1 | person), data=i, REML =
> >> FALSE)) f2 <- (lmer(outcome ~ predictor.2 + (1 | person), data=i,
> >> REML = FALSE)) anova(f1, f2)
> >
> > Data: i
> > Models:
> > f1: outcome ~ predictor.1 + (1 | person)
> > f2: outcome ~ predictor.2 + (1 | person)
> >     Df    AIC     BIC    logLik Chisq Chi Df Pr(>Chisq)
> > f1  6  45443  45489 -22715
> > f2 25  47317  47511 -23633     0     19          1
> >
> >
> > On Fri, Sep 4, 2009 at 4:18 PM, Bert Gunter <gunter.berton at gene.com>
> > wrote:
> >
> >> My guess would be:
> >>
> >> "Likelihood comparisons are not meaningful for objects fit using
> >> restricted maximum likelihood and with different fixed effects. "
> >>
> >> (from ?anova.lme in the nlme package).
> >>
> >> Are you using the REML = TRUE default?
> >>
> >> Bert Gunter
> >> Genentech Nonclinical Statistics
> >>
> >> -----Original Message-----
> >> From: r-help-bounces at r-project.org
> >> [mailto:r-help-bounces at r-project.org]
> >> On
> >> Behalf Of rapton
> >> Sent: Friday, September 04, 2009 9:10 AM
> >> To: r-help at r-project.org
> >> Subject: [R] Using anova(f1, f2) to compare lmer models yields
> >> seemingly erroneous Chisq = 0, p = 1
> >>
> >>
> >> Hello,
> >>
> >> I am using R to analyze a large multilevel data set, using
> >> lmer() to model my data, and using anova() to compare the fit of
> >> various models.  When I run two models, the output of each model is
> >> generated correctly as far as I can tell (e.g. summary(f1) and
> >> summary(f2) for the multilevel model output look perfectly
> >> reasonable), and in this case (see
> >> below) predictor.1 explains vastly more variance in outcome than
> >> predictor.2 (R2 = 15% vs. 5% in OLS regression, with very large N).
> >> What I am utterly puzzled by is that when I run an anova comparing
> >> the two multilevel model fits, the Chisq comes back as 0, with p =
> >> 1.  I am pretty sure that fit #1 (f1) is a much better predictor of
> >> the outcome than f2, which is reflected in the AIC, BIC , and logLik
> >> values.  Why might anova be giving me this curious output?  How can
> >> I fix it?  I am sure I am making a dumb error somewhere, but I
> >> cannot figure out what it is.  Any help or suggestions would be
> >> greatly appreciated!
> >>
> >> -Matt
> >>
> >>
> >>> f1 <- (lmer(outcome ~ predictor.1 + (1 | person), data=i)) f2 <-
> >>> (lmer(outcome ~ predictor.2 + (1 | person), data=i)) anova(f1, f2)
> >>
> >> Data: i
> >> Models:
> >> f1: outcome ~ predictor.1 + (1 | person)
> >> f2: outcome ~ predictor.2 + (1 | person)
> >>   Df    AIC      BIC    logLik   Chisq Chi Df Pr(>Chisq)
> >> f1  6  45443  45489 -22715
> >> f2 25  47317  47511 -23633     0     19          1
> >> --
> 
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
 Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%