# [R] Re: Enduring LME confusion… or Psychologists and Mixed-Effects

Christophe Pallier pallier at lscp.ehess.fr
Tue Aug 10 17:55:20 CEST 2004

```
Hello,

>>
>> Suppose I have a typical psychological experiment that is a
>> within-subjects design with multiple crossed variables and a
>> continuous response variable. Subjects are considered a random
>> effect. So I could model
>> > aov1 <- aov(resp ~ fact1*fact2 + Error(subj/(fact1*fact2)))
>>
>> However, this only holds for orthogonal designs with equal numbers of
>> observation and no missing values. These assumptions are easily
>> violated so I seek refuge in fitting a mixed-effects model with the
>> nlme library.
>

I suppose that you have, for each subject, enough observations to
compute his/her average response for each combination of fact1 and
fact2, no?
If that is the case, you can perform the analysis with the above formula
on the data obtained by 'aggregate(resp,list(subj,fact1,fact2),mean)'.
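# A minimal sketch of this aggregate-then-aov approach, on simulated
# data (all names and numbers here are illustrative assumptions, not
# from the original post):
set.seed(1)
d <- expand.grid(subj = factor(1:8), fact1 = factor(1:2),
                 fact2 = factor(1:3), rep = 1:4)
d$resp <- rnorm(nrow(d))
# one mean per subject x condition cell; aggregate names it 'x':
d.means <- aggregate(d$resp, list(subj = d$subj, fact1 = d$fact1,
                                  fact2 = d$fact2), mean)
aov1 <- aov(x ~ fact1*fact2 + Error(subj/(fact1*fact2)), data = d.means)
summary(aov1)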

This is an analysis with only *within-subject* factors, and there
*cannot* be a problem of unequal numbers of observations when you have
only within-subject factors (supposing you have at least one
observation for each subject in each condition).

I believe the problem with unequal numbers of observations only occurs
when you have at least two crossed *between-subject* (group) variables.

Let's imagine you have two binary group factors (A and B) yielding four
subgroups of subjects, and that, for some reason, you do not have the
same number of observations in each subgroup.
Then there are several ways of defining the main effects of A and B.

In many cases, the most reasonable definition of the main effect of A is
to take the average of the effect of A in B1 and in B2 (thus ignoring the
numbers of observations, or equivalently, weighting the four subgroups
equally).
To test the null hypothesis of no difference in A when all groups are
equally weighted, one common approach in psychology is to pretend that
the number of observations in each group is equal to the harmonic mean
of the numbers of observations in the subgroups. The sums of squares
thus obtained can be compared with the error sum of squares from the
standard anova to form an F-test.
This is called the "unweighted" approach.
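# The harmonic-mean trick can be done 'by hand', e.g. with hypothetical
# subgroup sizes (these numbers are made up for illustration):
n <- c(12, 15, 9, 14)        # observations in the four subgroups
nh <- length(n) / sum(1/n)   # harmonic mean of the cell sizes
# each cell is then treated as if it contained nh observations when
# computing the unweighted sums of squares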

This can easily be done 'by hand' in R, but there is another approach:

You get equivalent statistics as in the unweighted anova when you use
so-called 'type III' sums of squares (I read this in Howell, 1987,
'Statistical Methods in Psychology', and in John Fox's book 'An R and
S-PLUS Companion to Applied Regression', p. 140).

It is possible to get type III sums of squares using John Fox's 'car' library:

library(car)
contrasts(A) <- contr.sum   # sum-to-zero contrasts are needed for
contrasts(B) <- contr.sum   # sensible type III tests
Anova(aov(resp ~ A * B), type = "III")

You can compute the equally weighted cell means defining the effect of A
with, say:

with(aggregate(resp, list(a = A, b = B), mean), tapply(x, a, mean))

I have seen some people advise against using 'type III' sums of squares,
but I do not know their rationale. The important thing, it seems to me,
is to know which null hypothesis is tested in a given test. If the type
III sums of squares indeed test the effect on equally weighted means,
they seem okay to me (when this is indeed the hypothesis I want to
test).

I cannot comment on the 'lme' question itself (I hope others will do
so), but I feel that 'lme' is not needed in the context of unequal cell
frequencies (I am happy to be corrected if I am wrong). It seems to me
that 'lme' is useful when some assumptions of the standard anova are
violated (e.g. with repeated measurements when the assumption of
sphericity is false), or when you have several random factors.
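# For completeness, a sketch of the corresponding nlme fit, on simulated
# data (variable names are assumptions, not from the original post):
library(nlme)
set.seed(2)
d <- expand.grid(subj = factor(1:8), fact1 = factor(1:2),
                 fact2 = factor(1:3))
d$resp <- rnorm(nrow(d))
lme1 <- lme(resp ~ fact1 * fact2, random = ~ 1 | subj, data = d)
anova(lme1)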

Christophe Pallier
http://www.pallier.org

```