[R] Why can repeated measures anova with within & between subjects design not be done if group sizes are unbalanced?

Yuelin Li liy12 at mskcc.org
Thu Nov 8 16:52:58 CET 2007


Hope I am not too late joining this thread.  I believe R and SPSS
differ because SPSS adjusts the Type III SS by the harmonic mean of
the unbalanced cell sizes.  This calculation is discussed in Maxwell
and Delaney (1990, pp. 271-297).
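
As a small illustration of the quantity involved (not of SPSS's
internal computation, which I have not reproduced), here is the
harmonic mean of two unequal group sizes.  The sizes 8 and 4 match the
example in the details section; the variable names are made up.

```r
# Group sizes: 8 subjects in group 1, 4 subjects in group 2
n <- c(8, 4)

# Harmonic mean: number of groups divided by the sum of reciprocals
n_harmonic <- length(n) / sum(1 / n)

# Ordinary (arithmetic) mean, for comparison
n_arithmetic <- mean(n)

print(c(harmonic = n_harmonic, arithmetic = n_arithmetic))
```

The harmonic mean is pulled toward the smaller group, so it is below
the arithmetic mean whenever the sizes differ.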

In short, the best explanation I can offer (see details below) is that
SPSS and R produce the same output if you tell SPSS to use SSTYPE(1)
or SSTYPE(2) instead of the default SSTYPE(3).  As discussed in
Maxwell and Delaney, the calculations of SS1 and SS2 do not involve
the harmonic mean.  Maxwell and Delaney discuss the pros and cons of
each type of sums of squares.  Apparently SPSS thinks that the
harmonic-mean SS3 is the *right* analysis.  Like the people who
responded before me, I'd also suggest using lme() for unbalanced designs.
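
A rough sketch of what the lme() suggestion might look like, not a
worked analysis.  The simulated data frame d, its column names, and the
random-intercept-only structure (random = ~ 1 | subj) are assumptions
for illustration; a model closer to the aov() Error strata would need a
richer random-effects specification.

```r
library(nlme)  # recommended package that ships with R

set.seed(1)

# Hypothetical stand-in data: 12 subjects, 2 x 2 within-subject design
d <- expand.grid(subj  = factor(1:12),
                 color = factor(c("color1", "color2")),
                 shape = factor(c("shape1", "shape2")))

# Unbalanced between-subject factor: 8 subjects in g1, 4 in g2
d$grp <- factor(ifelse(as.integer(d$subj) <= 8, "g1", "g2"))

# Simulated response: subject-level offset, a color effect, and noise
subj_effect <- rnorm(12, sd = 1)
d$rt <- 10 + subj_effect[as.integer(d$subj)] +
  0.5 * (d$color == "color2") + rnorm(nrow(d), sd = 0.5)

# Mixed model with a random intercept per subject (an assumption)
fit <- lme(rt ~ grp * color * shape, random = ~ 1 | subj, data = d)

anova(fit)                     # sequential tests
anova(fit, type = "marginal")  # marginal tests
```

With unbalanced groups the sequential and marginal tables need not
agree, which is the same issue as the choice of SSTYPE in SPSS.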

Yuelin.

---- details -------
I used the Hays.df data:
http://www.psych.upenn.edu/~baron/rpsych/rpsych.html
And I added one between-subject variable:

Hays.df$grpuneven <- c(1,1,1,1,1,1,1,1,2,2,2,2) # n=8 in grp 1; 4 in grp 2

I ran aov(rt ~ grpuneven*color*shape + Error(subj/shape+color), data=Hays.df),
which gives the same output as SPSS does with SSTYPE(1) or SSTYPE(2)
using this syntax:

GLM
  Sh1Col1 Sh2Col1 Sh1Col2 Sh2Col2 BY grpuneven
  /WSFACTOR = color 2 Polynomial shape 2 Polynomial
  /METHOD = SSTYPE(2)
  /CRITERIA = ALPHA(.05)
  /WSDESIGN = color shape color*shape
  /DESIGN = grpuneven .

-- Gilbert G wrote --|Sun (Nov/04/2007)[04:34]|--:

   Dear R people:
   
   I wish to switch from SPSS to R, but there is one particular type of
   ANOVA design that cannot be done in R.  Or more likely, it can be
   done, but it is nowhere documented.

   [... snip ...]
   
   Now, in R you would have something like this, as anybody who does
   balanced repeated measures ANOVAs might know:
   
   aov( RT ~ color * shape * MyGroup + Error( Subjects/(color*shape) ) )
   
   In SPSS you would have something like this (of course with the data
   organized slightly differently):
   
   GLM
   x1 x2 x3 x4 BY MyGroup
    /WSFACTOR = color 2 Polynomial shape 2 Polynomial
    /METHOD = SSTYPE(3)
    /CRITERIA = ALPHA(.05)
    /WSDESIGN = color shape color*shape
    /DESIGN = MyGroup .
   
   OK, the question is this: if the group sizes are different (say 10
   people in one group and 12 in the other), R is going to give the
   wrong answer.  Of course that is not R's fault.
   
   BUT MY QUESTION IS: HOW TO GET THE UNBALANCED REPEATED MEASURES ANOVA RIGHT?
   

 