[R] lme4 and Variable level detection

Sat Feb 28 19:25:16 CET 2009

On Sat, Feb 28, 2009 at 9:00 AM, Jeroen Ooms <j.c.l.ooms at uu.nl> wrote:

> I am making a little GUI for lme4, and I was wondering if there is a function
> that automatically detects on which level every variable exists.
> Furthermore, I got kind of confused about what a random effects model
> actually calculates.

Questions such as this may be answered more quickly if you send them
to the R-SIG-Mixed-Models mailing list, which I am cc:ing on this
reply.

> I have some experience with commercial software packages for multilevel
> analysis, like HLM6, and I was surprised that lme4 does not require the user
> to specify the level for every predictor variable. Is this because the
> function automatically detects the level by testing on which levels the
> predictor has variance, or is this information simply not needed?

In some ways, exposure to software like HLM or MLWin can be more of a
hindrance than a help when learning about mixed models.  Both in the
presentation of the model and in the software itself, these packages
emphasize "levels" of random effects, leading to the impression that we
can only associate random effects with factors that are nested.  This
is a misconception.  There are many cases where it is eminently
sensible to associate random effects with factors that are completely
crossed ('subject' and 'item' are a prime example) or partially
crossed.  The archetypal example used in multilevel modeling,
achievement scores on students nested in classes nested in schools
nested in ..., becomes partially crossed when we track students over
time and they move from class to class or school to school.
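
To make the nested/crossed distinction concrete, here is a minimal
base-R sketch.  The data are made up and the helper name is_nested is
hypothetical (it is not part of lme4); it simply checks whether every
level of one factor occurs within exactly one level of another.

```r
# Hypothetical helper: TRUE when every level of `inner` occurs within
# exactly one level of `outer`, i.e. `inner` is nested in `outer`.
is_nested <- function(inner, outer) {
  tab <- table(inner, outer) > 0   # which combinations are observed
  all(rowSums(tab) == 1)
}

# Made-up data: 4 students in 2 classes, each student answers 3 items.
d <- expand.grid(student = factor(1:4), item = factor(c("a", "b", "c")))
d$class <- factor(ifelse(d$student %in% c("1", "2"), "A", "B"))

nested_student_class <- is_nested(d$student, d$class)  # student within class
nested_student_item  <- is_nested(d$student, d$item)   # completely crossed

# A student who changes class part-way through makes the design
# partially crossed, so the nesting check no longer holds.
d2 <- d
d2$class[d2$student == "1" & d2$item == "c"] <- "B"
nested_after_move <- is_nested(d2$student, d2$class)
```

In this sketch the first check holds and the other two do not.  As far
as I know, the random-effects terms in an lme4 formula, for example
(1 | student) + (1 | class), are written the same way in all three
situations, provided each student carries a unique label.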

I imagine that the reason for defining the model in terms of nested
factors for random effects is computational.  If you insist that the
random effects must always be defined with respect to nested factors
then you can employ methods that take advantage of this, with
considerable simplification in the storage and computational burden.
The lme4 package adopts a different approach based on sparse matrix
storage and decomposition methods.  It turns out that these methods
are competitive with the best methods for models based on nested
factors, in the cases to which they apply, and these methods allow for
fitting much more general models.

An unfortunate side-effect of the emphasis on levels in MLWin and HLM
is the perception that other covariates must be characterized by the
level at which they vary, even if these covariates only determine
fixed-effects parameters.  This is quite untrue and misleading.  The
only constraint on the covariates and the model matrix for the
fixed-effects parameters is that the model matrix must be of full
column rank.  In models that define random effects for slopes, or in
general for the coefficients associated with a covariate, the
constraint is that the covariate must vary within at least some levels
of the grouping factor of the random effect.  For example, we cannot
estimate a random effect for the coefficients for sex (M/F) within
subject (assuming we do not have transgender people in the study).
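
Both constraints can be checked directly on the data.  The following is
a minimal base-R sketch under stated assumptions: the data frame is
made up, and varies_within is a hypothetical helper, not an lme4
function.

```r
# Hypothetical helper: does covariate `x` vary inside at least one level
# of the grouping factor `g`?  If not, a random effect for the
# coefficient of `x` by `g` cannot be estimated.
varies_within <- function(x, g) {
  any(tapply(x, g, function(v) length(unique(v)) > 1))
}

# Made-up data: 3 subjects measured on 4 days each.
d <- data.frame(
  subject = factor(rep(1:3, each = 4)),
  day     = rep(0:3, times = 3),
  sex     = factor(rep(c("F", "M", "F"), each = 4))
)

day_varies <- varies_within(d$day, d$subject)  # changes within subject
sex_varies <- varies_within(d$sex, d$subject)  # constant within subject

# Fixed effects: the only requirement is a full-column-rank model matrix.
X <- model.matrix(~ day + sex, data = d)
full_rank <- qr(X)$rank == ncol(X)

# A column that merely rescales `day` makes the matrix rank deficient.
d$day_hours <- 24 * d$day
X_bad <- model.matrix(~ day + day_hours + sex, data = d)
full_rank_bad <- qr(X_bad)$rank == ncol(X_bad)
```

Here day varies within subject while sex does not, and only the first
model matrix has full column rank.  Note that neither check needs to
know the "level" of a covariate.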

My advice would be to avoid phrasing the model in terms of levels of
random effects.  Although I realize that those with a background of
using MLWin or HLM may find this more comfortable, I think it would be
propagating bad practices and misconceptions.

> I was taught that a crosslevel interaction predicts the regression
> coefficient of the lower level variable, which is also what is implied by
> the HLM gui. However, in an lme4 formula, a crosslevel interaction has the
> same syntax as a regular interaction term. Furthermore, lme4 also allows
> adding crosslevel interactions without a random slope for the lower level
> variable. Now I'm confused. Is there a fundamental difference between a
> crosslevel interaction and a regular interaction, or are they the same thing
> when the model also holds an error term for the lower level variable?
> -----
> Jeroen Ooms * Dept. of Methodology and Statistics * Utrecht University
>
> Visit http://www.jeroenooms.com to explore some of my
> current projects.