[R] Re: R-help Digest, Vol 18, Issue 12

Thu Aug 12 14:49:31 CEST 2004

The message for aov1 was "Estimated effects <may> be unbalanced".  The  
effects are not unbalanced.  The design is 'orthogonal'.

The problem is that there are not enough degrees of freedom to estimate  
all those error terms.  If you change the model to:
   aov1 <-  
aov(RT~fact1*fact2*fact3+Error(sub/(fact1+fact2+fact3)),data=myData)

or to

   aov2 <-  
aov(RT~fact1*fact2*fact3+Error(sub/ 
((fact1+fact2+fact3)^2)),data=myData)

all is well.  This last model (aov2) seems to me to have an excessive  
number of error terms.

The lme model lme(RT~fact1*fact2*fact3, random=~1|sub, data=myData)
is equivalent to aov0 <- aov(RT~fact1*fact2*fact3+Error(sub),  
data=myData)
It can be verified that the estimated variance components can be used  
to reproduce the mean squares in the anova table.

A conservative approach is to estimate e.g. contrasts involving fact1  
for each subject separately, then comparing these against SE estimates  
that have 4df (5 subjects -1).  This is the ultimate check -- are  
claimed effects consistent against the 5 subjects?  The standard errors  
should though, probably be based on some variety of averaged variance.

I do not know how to set up the equivalents of aov1 and aov2 (where the  
errors are indeed crossed) in lme4.

John Maindonald.

On 12 Aug 2004, at 8:03 PM, r-help-request at stat.math.ethz.ch wrote:

> I will follow the suggestion of John Maindonald and present the  
> problem by example with some data.
>
> I also follow the advice to use mean scores, somewhat reluctantly  
> though. I know it is common practice in psychology, but wouldn’t it be  
> more elegant if one could use all the data points in an analysis?
>
> The data for 5 subjects (myData) are provided at the bottom of this  
> message. It is a crossed within-subject design with three factors and  
> reaction time (RT) as the dependent variable.
>
> An initial repeated-measures model would be:
> aov1<-aov(RT~fact1*fact2*fact3+Error(sub/ 
> (fact1*fact2*fact3)),data=myData)
>
> Aov complains that the effects involving fact3 are unbalanced:
> > aov1
> ....
> Stratum 4: sub:fact3
> Terms:
>                      fact3   Residuals
> Sum of Squares  4.81639e-07 5.08419e-08
> Deg. of Freedom           2           8
>
> Residual standard error: 7.971972e-05
>
> 6 out of 8 effects not estimable
> Estimated effects may be unbalanced
> ....
> Presumably this is because fact3 has three levels and the other ones  
> only two, making this a 'non-orthogonal’ design.
>
> I then fit an equivalent mixed-effects model:
> lme1<-lme(RT~fact1*fact2*fact3,data=meanData2,random=~1|sub)
>
> Subsequently I test if my factors had any effect on RT:
> > anova(lme1)
>                  numDF denDF   F-value p-value
>
> (Intercept)           1    44 105294024  <.0001
> fact1                 1    44        22  <.0001
> fact2                 1    44         7  0.0090
> fact3                 2    44        19  <.0001
> fact1:fact2           1    44         9  0.0047
> fact1:fact3           2    44         1  0.4436
> fact2:fact3           2    44         1  0.2458
> fact1:fact2:fact3     2    44         0  0.6660
>
> Firstly, why are the F-values in the output whole numbers?
>
> The effects are estimated over the whole range of the dataset and this  
> is so because all three factors are nested under subjects, on the same  
> level. This, however, seems to inflate the F-values compared to the  
> anova(aov1) output, e.g.
> >  anova(aov1)
> ....
> Error: sub:fact2
>          Df     Sum Sq    Mean Sq F value Pr(>F)
> fact2      1 9.2289e-08 9.2289e-08  2.2524 0.2078
> Residuals  4 1.6390e-07 4.0974e-08
> ....
>
> I realize that the (probably faulty) aov model may not be directly  
> compared to the lme model, but my concern is if the lme estimation of  
> the effects is right, and if so, how can a naïve skeptic be convinced  
> of this?
>
> The suggestion to use simulate.lme() to find this out seems good, but  
> I can’t seem to get it working (from "[R] lme: simulate.lme in R" it  
> seems it may never work in R).
>
> I have also followed the suggestion to fit the exact same model with  
> lme4. However, format of the anova output does not give me the  
> estimation in the way nlme does. More importantly, the degrees of  
> freedom in the denominator don’t change, they’re still large:
> > library(lme4)
> > lme4_1<-lme(RT~fact1*fact2*fact3,random=~1|sub,data=myData)
> > anova(lme4_1)
> Analysis of Variance Table
>
>                     Df    Sum Sq   Mean Sq Denom F value    Pr(>F)   
> fact1I                1 2.709e-07 2.709e-07    48 21.9205 2.360e-05  
> ***
> fact2I                1 9.229e-08 9.229e-08    48  7.4665  0.008772 **
> fact3L                1 4.906e-08 4.906e-08    48  3.9691  0.052047 .
> fact3M                1 4.326e-07 4.326e-07    48 34.9972 3.370e-07 ***
> fact1I:fact2I         1 1.095e-07 1.095e-07    48  8.8619  0.004552 **
> fact1I:fact3L         1 8.988e-10 8.988e-10    48  0.0727  0.788577   
> fact1I:fact3M         1 1.957e-08 1.957e-08    48  1.5834  0.214351   
> fact2I:fact3L         1 3.741e-09 3.741e-09    48  0.3027  0.584749   
> fact2I:fact3M         1 3.207e-08 3.207e-08    48  2.5949  0.113767   
> fact1I:fact2I:fact3L  1 2.785e-09 2.785e-09    48  0.2253  0.637162   
> fact1I:fact2I:fact3M  1 7.357e-09 7.357e-09    48  0.5952  0.444206   
> ---
> I hope that by providing a sample of the data someone can help me out  
> on the questions I asked in my previous mail:
>
John Maindonald             email: john.maindonald at anu.edu.au
phone : +61 2 (6125)3473    fax  : +61 2(6125)5549
Centre for Bioinformation Science, Room 1194,
John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.