[R] is there a similar function to perform repeated statements asin SAS PROC MIXED?

Douglas Bates bates at stat.wisc.edu
Sun Oct 28 20:13:30 CET 2007


On 10/24/07, Dimitris Rizopoulos <dimitris.rizopoulos at med.kuleuven.be> wrote:
> you can wave a look at the gls() function in package nlme, and
> specifically at the `weights' and `correlation' arguments; the same
> arguments are also available in the lme() function.

That is certainly a way of fitting a correlation structure directly
but I don't think of that as a mixed-effects model.  To me a
mixed-effects model is a model with both fixed-effects parameters and
random effects.

It is true that SAS PROC MIXED will fit such a model but I don't think
that necessarily makes it a mixed-effects model.  I think the SAS
formulation of mixed-effects models blurs many distinctions and leads
to considerable confusion.  At least it confuses me :-)

It doesn't help that the terminology used in PROC MIXED is so sloppy
as to be nonsensical.  Gallon Li uses the SAS terminology of an
"unstructured variance-covariance matrix".  Perhaps I am being
pedantic but anyone who was not raised in a SAS-speaking environment
would, upon reflection, realize that this is asking for an
unstructured, highly-structured matrix.  It's an oxymoron.  Any
definition I have ever seen of the variance-covariance matrix of a
random vector requires that the matrix be symmetric (hence square) and
at least positive semi-definite, if not positive-definite.

Even in the one-dimensional case I would interpret an unstructured
variance matrix to be an unconstrained 1 by 1 matrix whose element is
required to be non-negative.  I have difficulty grasping the concept
of an unstructured matrix that is required to satisfy several
complicated constraints.

If the term had been a "general variance-covariance matrix" I think I
could understand what it was supposed to mean.

The REPEATED statement propagates the dangerous misconception that
there is a single model which is appropriate for any case of repeated
measures.   The approach I would recommend for a person using R is to
plot the data, formulate a preliminary model, fit the model, examine
the residuals, modify the model if appropriate, rinse and repeat.  In
the days when SAS was being developed the "rinse and repeat" part
could take several days so it made sense to try a "one size fits all"
model.  I don't feel that makes sense any more.

Let me describe what I understand the REPEATED model with an
"unconstrained" variance-covariance matrix to mean.  If I have it
wrong I hope you or others reading this will correct me.

Suppose the observations are indexed by subject and occasion.  For
definiteness, consider the first 5 days (Days 0 to 4) of data from the
sleepstudy data in the lme4 package.  Convert the Days variable to a
factor, which will will call occ for occasion, and incorporate the
indicators for the occasion in both the fixed effects and the random
effects indexed by subject.

> sleep04 <- subset(sleepstudy, Days < 5)
> sleep04$occ <- factor(sleep04$Days)
> show(fm1 <- lmer(Reaction ~ occ - 1 + (occ - 1|Subject), sleep04))
Linear mixed-effects model fit by REML
Formula: Reaction ~ occ - 1 + (occ - 1 | Subject)
   Data: sleep04
 AIC BIC logLik MLdeviance REMLdeviance
 808 858   -384      792.7          768
Random effects:
 Groups   Name Variance Std.Dev. Corr
 Subject  occ0  987.132 31.4187
          occ1 1072.423 32.7479  0.769
          occ2  823.516 28.6970  0.493 0.808
          occ3 1464.770 38.2723  0.481 0.769 0.913
          occ4 1764.239 42.0028  0.465 0.673 0.722 0.939
 Residual        45.184  6.7219
Number of obs: 90, groups: Subject, 18

Fixed effects:
     Estimate Std. Error t value
occ0  256.652      7.573   33.89
occ1  264.496      7.880   33.57
occ2  265.362      6.947   38.20
occ3  282.992      9.159   30.90
occ4  288.649     10.026   28.79

Correlation of Fixed Effects:
     occ0  occ1  occ2  occ3
occ1 0.737
occ2 0.470 0.770
occ3 0.464 0.741 0.875
occ4 0.449 0.651 0.694 0.914

This model requires estimation of  15 variance-covariance parameters
using data from only 18 subjects.  You can fit such a model to more
that 5 occasions from these data but I'm sure the fits are badly
overparameterized.

Let me emphasize that I am not criticizing you for your perfectly
reasonable answer to Gallon Li's question, nor am I criticizing Gallon
Li for the question.  I am simply expressing my frustration at the
sloppy terminology and "one size fits all" mentality implicit in
saying that this model is "the" repeated measures model.

As an example of why I say that the REPEATED statement leads to
dangerous misconceptions, I'll give a bit of the story of the PBG data
set from the nlme package.   The data came from an experiment by a
cardiologist who was quite knowledgeable regarding various statistical
techniques.  He used these data in a tutorial paper about repeated
measures analysis published in a cardiology journal.  The repeated
measures analysis was somewhat disappointing in that it did not show a
significant effect for treatment.  However a plot of the data, such as
Fig. 3.4 in Pinheiro and Bates (2000), which can be reproduced by

data(PBG, package = "nlme")
require(lattice)
xyplot(deltaBP ~ log(dose) | Rabbit, PBG, groups = Treatment,
      type = c("g", "b"), aspect = 1, auto.key = list(lines = TRUE,
      space = "top", columns = 2))

shows a strong effect for treatment.

What happened?  Well, the effect for treatment is not of the sort that
would show up in the model being fit by "the" repeated measures model.
 You can see in that figure that the effect for treatment is to shift
the sigmoidal response curve for each rabbit horizontally but "the"
repeated measures model doesn't allow for that.  If you fit an
appropriate model you do indeed see a strong effect for treatment.

So, yes, I suppose you can emulate the effect of the REPEATED
statement within lme or lmer but I wouldn't advise doing so.


> ----- Original Message -----
> From: "gallon li" <gallon.li at gmail.com>
> To: "r-help" <r-help at stat.math.ethz.ch>
> Sent: Wednesday, October 24, 2007 2:43 PM
> Subject: [R] is there a similar function to perform repeated
> statements asin SAS PROC MIXED?
>
>
> > PROC MIXED is used to fit mixed effects model for correlated data.
> >
> > Usually we can use either a REPEATED statment or a RANDOM statement.
> >
> > The random statement is corresponding to lme function in R --
> > specifying a
> > random effect term.
> >
> > The repeated statement actually directly specifies the covariance
> > structure
> > -- is there a similar function in R to do this? I currently want to
> > specify
> > a unstructured matrix. (Of course if i just want to use compound
> > symmetry, i
> > know i can still use lme.)
> >
> > [[alternative HTML version deleted]]
> >
> > ______________________________________________
> > R-help at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
>
>
> Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



More information about the R-help mailing list