[R] How to Code Random Nested Variables within Two-way Fixed Model in lmer or lme

Fri Sep 30 09:46:00 CEST 2011

Dave, your situation is clearer now. You wrote (see the full context at
the end of the message):

> From this, you will see that I have 4 control sites and 7 treatment
> sites that are measured each week.  All 13 locations have different
> names, and Location is a random variable.  Is Location nested within
> Habitat?  I thought it was, but maybe I am wrong.  Perhaps it is a
> random variable that is not nested?

I think so, or at least the nesting is irrelevant here. If all factors
were fixed, you would specify the nesting of Location within Habitat with
the term "Habitat/Location", which expands to "Habitat +
Habitat:Location". Since every Location has its own name and belongs to
exactly one Habitat, Habitat:Location defines the same groups as
Location, so this amounts to "Habitat + Location". You can therefore
safely separate both terms.
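
The equivalence can be checked in base R. This is only a sketch: the
data frame `d` is a made-up stand-in for the design (locations 1-4 as
Control and 5-13 as Treatment, following the table quoted at the end of
the message), not the real data.

```r
# Hypothetical stand-in for the design: 13 uniquely named locations,
# each belonging to exactly one habitat.
d <- data.frame(
  Location = factor(1:13),
  Habitat  = factor(rep(c("Control", "Treatment"), times = c(4, 9)))
)

# How R expands the nesting operator in a formula.
expanded <- attr(terms(~ Habitat/Location), "term.labels")
print(expanded)

# The observed Habitat:Location combinations map one-to-one onto the
# locations, so both define the same grouping of the observations.
cells <- interaction(d$Habitat, d$Location, drop = TRUE)
same_grouping <- nlevels(cells) == nlevels(d$Location)
print(same_grouping)
```

If the locations had been labelled 1 to 4 within each habitat instead,
the two groupings would differ and the explicit nesting would matter.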

Moreover, the different type of effect for Habitat and Location makes
the nesting notation unsuitable, because Habitat is a fixed effect,
whereas Habitat:Location (i.e. Location) is random. If you put
"Habitat/Location" in the "fixed" part of the formula, both terms would
be treated as fixed, and if you put it in the "random" part, both would
be treated as random, and you don't want that, I suppose.

> My main goal is to look for an effect of Habitat.  But if there is a
> significant Week x Habitat interaction, I would examine the effect of
> Habitat separately for each Week.
>
> Hopefully, the above helps to clarify my situation.  I should re-state,
> I would like to use an lmer or lme syntax to properly analyze these
> data, especially given that they are counts, I would like to try family
> = poisson or quasipoisson.

If you need a generalized linear model, you can try lmer (in package
lme4), since it supports the "family" argument where you can specify the
type of error distribution. On the other hand, lme (in package nlme)
only considers normal errors.

Regarding the formula, I like building formulas term by term, asking
myself if each possible term has a potential effect, and including it in
the model if I can answer "yes". From what you have said, I can infer
that you do think that there may be a Week:Habitat interaction, so that
term must be in your formula. Now:

1. Do you think that the habitat may influence your outcome (the
counts), regardless of the other factors? I guess you do, so let's
include Habitat as well.

2. Do you think that Week may influence the counts, regardless of the
other factors?

  - 2.1. If so, the fixed part of your formula would be Habitat*Week (=
Habitat + Week + Week:Habitat)

  - 2.2. Otherwise, it would be Habitat/Week (= Habitat +
Habitat:Week)
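
The two expansions can be inspected with terms() and model.matrix() in
base R. The sketch below assumes that Week is coded as a factor and uses
a made-up balanced design `d` (13 locations by 11 weeks), not the real
data. Under those assumptions it also suggests that the two formulas
span the same column space, so they would describe the same fitted
values and differ only in how the coefficients are parameterised.

```r
# Hypothetical balanced design: 13 locations measured in each of 11 weeks.
d <- expand.grid(Location = factor(1:13), Week = factor(1:11))
d$Habitat <- factor(ifelse(as.integer(d$Location) <= 4,
                           "Control", "Treatment"))

# Terms generated by each fixed-effects specification.
labels_crossed <- attr(terms(~ Habitat * Week), "term.labels")
labels_nested  <- attr(terms(~ Habitat / Week), "term.labels")
print(labels_crossed)
print(labels_nested)

# Compare the design matrices: if the rank of the combined matrix equals
# the rank of each one, both span the same column space.
X_crossed <- model.matrix(~ Habitat * Week, data = d)
X_nested  <- model.matrix(~ Habitat / Week, data = d)
rank_crossed <- qr(X_crossed)$rank
rank_nested  <- qr(X_nested)$rank
rank_joint   <- qr(cbind(X_crossed, X_nested))$rank
print(c(rank_crossed, rank_nested, rank_joint))
```

With Habitat/Week the coefficients read as "Week effect within each
habitat", which may be the more convenient form if the aim is to look at
the habitats separately.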

As already commented on, the random part would just be Location. In
theory, since each location is measured in various weeks, you might
consider that the (fixed) effect of Week could be influenced by the
random Location as well, and in that case you would have an additional
random term, the Location:Week interaction. (I.e., you could write the
random term as "Location/Week".) However, in your data set there is only
one observation for each value of Location:Week, so it would be
impossible to distinguish that random term from the residual error, and
you may just omit it.
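
Whether a Location:Week term has any replication behind it can be
checked by counting observations per cell. The sketch below uses a
made-up balanced design `d` standing in for the real data, which are
assumed to have exactly one count per location and week.

```r
# Hypothetical stand-in: one row per Location-by-Week combination.
d <- expand.grid(Location = factor(1:13), Week = factor(1:11))

# Number of observations in each Location x Week cell.
cell_counts <- table(d$Location, d$Week)
all_singletons <- all(cell_counts == 1)
print(all_singletons)

# A Location:Week grouping factor would then have as many levels as
# there are rows, i.e. one level per observation.
n_cells <- nlevels(interaction(d$Location, d$Week, drop = TRUE))
print(c(n_cells = n_cells, n_rows = nrow(d)))
```

This argument applies most directly to a model with normal errors. In a
Poisson model there is no separate residual variance, and a
per-observation random effect is sometimes used to absorb
overdispersion, so the term is not necessarily meaningless there.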

All in all, you can try:

m. <- lmer(CO ~ Habitat*Week + (1|Location), family=poisson)

or if Week is only relevant for different types of habitat:

m. <- lmer(CO ~ Habitat/Week + (1|Location), family=poisson)

I must admit that I'm not used to analysing generalized linear models,
so I don't know if that approach is correct, but I'd say that's the code
to do what you asked for.
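
For readers on a current version of lme4: there the "family" argument
belongs to glmer(), and calling lmer() with "family" is deprecated. The
following is a minimal sketch of the same two models with glmer(). The
data frame `d`, the effect sizes and the coding of Week as a factor are
all assumptions made up for illustration; only the column names come
from the thread. As far as I know, glmer() does not accept quasi
families such as quasipoisson.

```r
# Simulated stand-in for the real data (all numbers are invented).
set.seed(1)
d <- expand.grid(Location = factor(1:13), Week = factor(1:11))
d$Habitat <- factor(ifelse(as.integer(d$Location) <= 4,
                           "Control", "Treatment"))
loc_effect <- rnorm(13, sd = 0.3)   # hypothetical random location effects
d$CO <- rpois(nrow(d), lambda = exp(1.8 + loc_effect[as.integer(d$Location)]))

# Fit only if lme4 is installed.
if (requireNamespace("lme4", quietly = TRUE)) {
  # Habitat, Week and their interaction as fixed effects,
  # random intercept for Location.
  m_crossed <- lme4::glmer(CO ~ Habitat * Week + (1 | Location),
                           data = d, family = poisson)
  # Week effects expressed within each habitat.
  m_nested <- lme4::glmer(CO ~ Habitat / Week + (1 | Location),
                          data = d, family = poisson)
  print(summary(m_crossed))
}
```

Passing the data frame through the "data" argument avoids relying on
CO, Habitat, Week and Location being loose variables in the workspace.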

Now, the bad news is that perhaps you are expecting to get p-values
from anova(m.), but you won't get them. Douglas Bates explained why here:
https://stat.ethz.ch/pipermail/r-help/2006-May/094765.html 

On the other hand, you can get p-values from Anova (from package car),
instead of anova, but I don't entirely understand the calculations of
that approach.
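
A minimal sketch of the Anova() route, under the same assumptions as
before: the data frame `d` is simulated, Week is treated as a factor,
and the model is fitted with glmer() from a current lme4. For models of
this kind, Anova() in car reports Wald chi-square tests for each fixed
term (type II by default), which are approximate and should be read
with that in mind.

```r
# Simulated stand-in for the real data (all numbers are invented).
set.seed(1)
d <- expand.grid(Location = factor(1:13), Week = factor(1:11))
d$Habitat <- factor(ifelse(as.integer(d$Location) <= 4,
                           "Control", "Treatment"))
d$CO <- rpois(nrow(d), lambda = exp(1.8 + rnorm(13, sd = 0.3)[as.integer(d$Location)]))

# Run only if both packages are installed.
if (requireNamespace("lme4", quietly = TRUE) &&
    requireNamespace("car", quietly = TRUE)) {
  m <- lme4::glmer(CO ~ Habitat * Week + (1 | Location),
                   data = d, family = poisson)
  # One row per fixed term: Habitat, Week, Habitat:Week.
  print(car::Anova(m))
}
```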

Helios

>>> On 29/09/2011 at 20:49, Dave Robichaud <drobichaud at lgl.com>
wrote:
> Hi again,
>
> Thank you very much for taking the time to respond to my question.  I am
> sorry that my explanation was confusing.  Please allow me to try to
> clarify.
>
> First, please ignore my attempts to define a lmer model.  By putting
> forward my best first guess, which was clearly wrong, I have only served
> to confuse matters.  My goal here is to get advice on how to formulate
> the correct lmer model.  Hopefully someone can help with that.
>
> I should describe my data in more detail.  I have the following columns:
>
> Location    Habitat    Week     CO
> 1           Control    1        10
> 2           Control    1        12
> 3           Control    1         0
> 4           Control    1         5
> 5           Treatment  1        10
> 6           Treatment  1         7
> 7           Treatment  1         8
> 8           Treatment  1         6
> 9           Treatment  1         0
> 10          Treatment  1         5
> 11          Treatment  1         3
> 12          Treatment  1        12
> 13          Treatment  1         0
> ...    (9 weeks of data omitted to save space)
> 1           Control    11        9
> 2           Control    11        8
> 3           Control    11        3
> 4           Control    11        6
> 5           Treatment  11        9
> 6           Treatment  11        6
> 7           Treatment  11        5
> 8           Treatment  11       10
> 9           Treatment  11        2
> 10          Treatment  11        4
> 11          Treatment  11        6
> 12          Treatment  11        9
> 13          Treatment  11        2
> 
> From this, you will see that I have 4 control sites and 7 treatment
> sites that are measured each week.  All 13 locations have different
> names, and Location is a random variable.  Is Location nested within
> Habitat?  I thought it was, but maybe I am wrong.  Perhaps it is a
> random variable that is not nested?
>
> My main goal is to look for an effect of Habitat.  But if there is a
> significant Week x Habitat interaction, I would examine the effect of
> Habitat separately for each Week.
>
> Hopefully, the above helps to clarify my situation.  I should re-state,
> I would like to use an lmer or lme syntax to properly analyze these
> data, especially given that they are counts, I would like to try family
> = poisson or quasipoisson.
> 
> Thanks again,
> 
> Dave

[Copies of earlier posts snipped. See the earlier history of this thread
at:
https://stat.ethz.ch/pipermail/r-help/2011-September/291178.html ]

INSTITUTO DE BIOMECÁNICA DE VALENCIA
Universidad Politécnica de Valencia • Edificio 9C
Camino de Vera s/n • 46022 VALENCIA (ESPAÑA)
Tel. +34 96 387 91 60 • Fax +34 96 387 91 69
www.ibv.org
