[R] Covariates in LME?

Douglas Bates bates at stat.wisc.edu
Thu Mar 27 21:44:44 CET 2008


On Thu, Mar 27, 2008 at 7:01 AM, Aberg Carl <kristoffer.aberg at epfl.ch> wrote:
> Hi,
>  I'm using lme to calculate a mixed-factors ANOVA according to:

>  px_anova = anova(lme(dep~music*time*group, random = ~1|id, data = px_data))

>  where

>  dep is a threshold,
>  time is a repeated measures variable (2 levels)
>  group is a between subjects variable (2 levels)
>  id is a random factor (subject id)
>  music is a between subjects variable (2 levels) indicating whether a person has musical experience

>  Musical experience is currently determined by categorizing subjects according to the number of years they have practiced an instrument.

>  I would like to use the years of playing an instrument as a covariate instead of creating categories.

Hmm.  Your question can be answered on the level of tactics (i.e. an
immediate response to the question that was asked) or on the level of
strategy (considering why you are asking the question in the first
place).

The tactics answer is just to use the numeric variable, say 'years',
instead of the factor 'music'.  The formula language for linear models
in R is very flexible and is described in many of the books that are
listed on the "Books" link at www.r-project.org.
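
As a minimal sketch of the tactical answer: the original data are not
available here, so the px_data below is simulated, and the column name
'years' as well as all of the numbers are assumptions made only for
illustration.  The one substantive change from the original call is
that the numeric 'years' takes the place of the factor 'music'.

```r
library(nlme)

# Simulated stand-in for px_data (hypothetical values, illustration only)
set.seed(1)
n_subj <- 40
subj <- data.frame(
  id    = factor(seq_len(n_subj)),
  group = factor(rep(c("a", "b"), each = n_subj / 2)),  # between subjects
  years = round(runif(n_subj, 0, 20))                   # years of playing
)

# Two rows per subject: one for each level of the repeated measure
px_data <- subj[rep(seq_len(n_subj), each = 2), ]
px_data$time <- factor(rep(c("pre", "post"), times = n_subj),
                       levels = c("pre", "post"))

# Response = linear effect of years + subject intercept + noise
subj_eff <- rnorm(n_subj, sd = 1)
px_data$dep <- 10 - 0.1 * px_data$years +
  subj_eff[as.integer(px_data$id)] +
  rnorm(nrow(px_data), sd = 0.5)

# Same model as in the question, with numeric 'years' replacing 'music'
fit_years <- lme(dep ~ years * time * group, random = ~ 1 | id,
                 data = px_data)
anova(fit_years)
```

Because 'years' is numeric, each term involving it uses a single
degree of freedom for a slope rather than comparing category means.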

The strategy answer would address the question of why you are writing
the model as dep ~ music*time*group and why you don't save the fitted
model but instead immediately pass it to the anova function.  It seems
that you are approaching the problem as a special type of ANOVA
problem so the only items of interest are the F statistics and
p-values in an ANOVA table.  The more common approach in R is to model
your data, first by plotting the data so you can formulate an initial
model, then fitting that model, examining residual plots and other
diagnostics, and modifying the model if indicated. Only after that
process has converged on a model that seems reasonable does one
calculate inferential statistics such as p-values.
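
The sequence described above might be sketched as follows.  This is one
possible arrangement, not a prescription: the data are again simulated
(hypothetical values, with an assumed numeric column 'years'), and the
particular candidate models are chosen only to show the steps.

```r
library(nlme)

# Simulated stand-in for px_data (hypothetical values, illustration only)
set.seed(2)
n_subj <- 40
px_data <- data.frame(
  id    = factor(rep(seq_len(n_subj), each = 2)),
  time  = factor(rep(c("pre", "post"), times = n_subj),
                 levels = c("pre", "post")),
  group = factor(rep(c("a", "b"), each = n_subj)),   # between subjects
  years = rep(round(runif(n_subj, 0, 20)), each = 2)
)
px_data$dep <- 10 - 0.1 * px_data$years +
  rep(rnorm(n_subj), each = 2) +     # subject-level random intercept
  rnorm(nrow(px_data), sd = 0.5)     # within-subject noise

# 1. Plot the data first (colour = group, symbol = time)
plot(dep ~ years, data = px_data,
     col = as.integer(px_data$group), pch = as.integer(px_data$time))

# 2. Fit an initial model and save it; ML rather than REML so that
#    models with different fixed effects can be compared later
fit_full <- lme(dep ~ years * time * group, random = ~ 1 | id,
                data = px_data, method = "ML")

# 3. Examine residual diagnostics
plot(fit_full)                             # standardized residuals vs fitted
qqnorm(fit_full, ~ resid(., type = "p"))   # normal Q-Q plot of residuals

# 4. Modify the model if indicated, e.g. try a simpler candidate
fit_main <- lme(dep ~ years + time + group, random = ~ 1 | id,
                data = px_data, method = "ML")
anova(fit_main, fit_full)                  # likelihood ratio comparison

# 5. Only then look at inferential statistics for the chosen model
summary(fit_main)$tTable
```

Saving the fitted model in a variable is what makes steps 3 to 5
possible; passing lme(...) straight to anova() discards it.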

The inferential statistics are always based on mathematical models of
the data and will be misleading unless the model is appropriate.  The
model is never "correct".  As George Box famously said, "All models
are wrong; however, some models are useful."


