[R] nested mixed-effect model: variance components

Spencer Graves spencer.graves at pdf.com
Sat Jun 10 18:16:54 CEST 2006


	  I have seen no reply to this, so I will offer a couple of comments in 
spite of the fact that I know very little about "aov" other than it is 
old and has largely been superceded by "lme" in the "nlme" package. 
I've replied to many posts on random and mixed effects over the past few 
years, and I only remember one case where 'aov' returned an answer that 
could not have been obtained easily from 'lme'.  That one application 
involved estimating a saturated model from perfectly balanced 
experimental data.  'lme' refused to return an answer, at least as I 
specified the model, and 'aov' returned numbers from which anyone 
familiar with the theory for that special case could compute the desired 
F ratios and p-values.  In all other cases that I remember, 'aov' would 
either return the same answer, or much more commonly, a substantively 
inferior answer -- if it were feasible to use 'aov' at all.

	  I mention this, because the simplest way I can think of to check your 
answer is to suggest you try to work it using 'lme'.  To learn how to do 
this, I strongly encourage you to consult Pinheiro and Bates (2000) 
Mixed-Effects Models in S and S-PLUS (Springer).  Bates has been for 
many years one of the leading contributors in this area and is the 
primary author of the 'nlme', 'lme4' and 'Matrix" packages to support 
working with these kinds of models.  This book discusses both how to fit 
these kinds of models as well as how to produce plots of various kinds 
that can be very helpful for explaining the experimental results to 
others as well as diagnosing potential problem.  Moreover, the R scripts 
discussed in the book are available in files called "ch01.R", "ch02.R", 
..., "ch06.R", "ch08.R" in a subfolder '~library\nlme\scripts' of the 
directory in which R is installed on your hard drive.  This makes it 
much easier to work through the examples in the book one line at a time, 
experimenting at with modifications.  In addition, there is one point I 
know where the R syntax differs from that in the book:  S-Plus will 
accept x^2 in a formula as meaning the square of a numeric variable;  R 
will not.  To specify a square in R, you need something like I(x^2). 
When I copied the commands out of the book, I had serious trouble 
getting answers like those in the book until I identified and corrected 
this difference in syntax.  If you use the script files, they provide 
the R syntax.

	  I'm not certain, but I believe the following should estimate the 
model you wrote below:

	  fit <- lme(fixed=COVER ~ HABITAT,
		    random = ~1|LAGOON/HABITAT,
		    data=cov).

	  For comparison, I refer you to Pinheiro and Bates, p. 47, where I 
find the following:

fm1Oats <- lme( yield ~ ordered(nitro) * Variety, data = Oats,
   random = ~ 1 | Block/Variety )

	  There are 3 Varieties and 6 Blocks in this Oats data.frame.  The 
fixed effect of Variety has therefore 2 degrees of freedom.  However, 
the random effect of Variety occurs in 18 levels, being all the 6*3 
Block:Variety combinations.  You can see this by examining 'str(fm1Oats)'.

	  If you want to know "how much variation is due to lagoons? habitats? 
lagoons*habitat? transects?", this model will NOT tell you that, and I 
don't know how to answer that question using 'lme'.  I was able to 
answer a question like that using 'lmer' associated with the 'lme4' and 
'Matrix' packages.  Unfortunately, these packages have some names 
conflicts with 'nlme', and the simplest way I know to change from one to 
the other is to "q()" and restart R  Before I did, however, I made a 
local copy of the "Oats" data.frame.  After I did that, I seemed to get 
sensible numbers from the following:

library(lme4)

fm1Oats4 <- lmer(yield~ ordered(nitro) * Variety
         +(1|Block)+(1|Variety)+(1|Block:Variety),
                  data=Oats)

	  For both "lme" and "lmer", the default "method" is "REML" (restricted 
maximum likelihood).  This is what you want for estimation.  For testing 
random effects, you still want "REML", but you should adjust the degrees 
of freedom of the reference "F" distribution as discussed in section 2.4 
of Pinheiro and Bates;  this also applies to confidence intervals for 
the random effects.  For testing fixed effects, you should use "method = 
'ML'".

	  "lme4" is newer than "nlme" and does not currently have available the 
complete set of helper functions for plotting, etc.  Thus, you will 
likely want to use both.  For documentation on 'lmer', you should still 
start with Pinheiro and Bates for the general theory, then refer to the 
vignettes associated with the "mlmRev" and "lme4" packages;  if you 
don't know about vignettes RSiteSearch("graves vignette") will lead you 
to "http://finzi.psych.upenn.edu/R/Rhelp02a/archive/67006.html" and 
other replies to r-help where I've described how to use them.  You will 
also want the relevant article by Doug Bates in R News.  To find that, 
go "www.r-project.org" -> Documentation:  Newsletter -> Download: 
"Table of Contents", then search for Bates until you find "Douglas 
Bates. Fitting linear mixed models in R. R News, 5(1):27-30, May 2005."
That says you want vol. 5(1).

	  Hope this helps,
	  Spencer Graves

Eric Pante wrote:
> Dear listers,
> 
> I am trying to assess variance components for a nested, mixed-effects 
> model. I think I got an answer that make sense from R, but I have a 
> warning message and I wanted to check that what I am looking at is 
> actually what I need:
> 
> my data are organized as transects within stations, stations within 
> habitats, habitats within lagoons.
> lagoons: random, habitats: fixed
> the question is: how much variation is due to lagoons? habitats? 
> lagoons*habitat? transects?
> 
> Here is my code:
> 
> res <- aov(COVER ~ HABITAT + Error(HABITAT+LAGOON+LAGOON/HABITAT), 
> data=cov)
> summary(res)
> 
> and I get Sum Sq for each to calculate variance components:
> 
> Error: STRATE
>         Df Sum Sq Mean Sq
> STRATE  5 4493.1   898.6
> 
> Error: ATOLL
>            Df Sum Sq Mean Sq F value Pr(>F)
> Residuals  5 3340.5   668.1
> 
> Error: STRATE:ATOLL
>            Df  Sum Sq Mean Sq F value Pr(>F)
> Residuals 18 2442.71  135.71
> 
> Error: Within
>             Df Sum Sq Mean Sq F value Pr(>F)
> Residuals 145 6422.0    44.3
> 
> My error message seems to come from the LAGOON/HABITAT, the Error is 
> computed.
> Warning message: Error() model is singular in: aov(COVER ~ HABITAT + 
> Error(HABITAT+LAGOON+LAGOON/HABITAT), data=cov),
> 
> THANKS !!!
> eric
> 
> 
> 
> Eric Pante
> ----------------------------------------------------------------
> College of Charleston, Grice Marine Laboratory
> 205 Fort Johnson Road, Charleston SC 29412
> Phone: 843-953-9190 (lab)  -9200 (main office)
> ----------------------------------------------------------------
> 
> 	"On ne force pas la curiosite, on l'eveille ..."
> 	Daniel Pennac
> 
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html



More information about the R-help mailing list