[R] AOV and Error

Peter Dalgaard p.dalgaard at biostat.ku.dk
Sat Aug 16 00:00:51 CEST 2008


Prof Brian Ripley wrote:
> See the reference on ?aov, and MASS (the book, see the FAQ).
>
> I think you need to understand the underlying theory first, and that 
> is no longer (even for my time) part of a statistical education.  I 
> learnt it from Bill Venables who was educated in the 1960s -- so his 
> account in MASS comes with at least one satisfied client.
>
Hmm, I'm younger than Brian and I did study this extensively, based on 
the description in the Genstat manual (1977) and Tue Tjur's lecture 
notes (later developed into his 1984 paper in Int. Statist. Rev. 52, 
pp. 33-65).

The way I prefer to think about it is the following. It works only when 
the error model is completely balanced and factorial, but there are 
hardly any other models that are interpretable.

Assume for the sake of discussion a complete two-way layout (A*B) within 
Subject. A relevant model could be

    y ~ A*B + Error(Subj/(A*B))

Start by expanding the Error() terms into simple interactions, i.e. 
Subj/(A*B) = Subj + Subj:A + Subj:B + Subj:A:B. Each term defines a 
table containing a (constant) number of observations in each cell. The 
error model is that each term carries its own variance component: a 
random contribution that is common to all observations within the same 
cell of that table, and independent between different cells.
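
To make the expansion concrete, here is a minimal sketch in R. The 
layout (6 subjects, A with 2 levels, B with 3 levels, one observation 
per Subj:A:B cell) and the standard deviations are made-up values for 
illustration only; nothing here comes from a real data set.

```r
# Expand the nested error formula into its simple terms.
labs <- attr(terms(~ Subj/(A*B)), "term.labels")
print(labs)

# Hypothetical balanced layout: every Subj:A:B combination occurs once.
d <- expand.grid(Subj = factor(1:6), A = factor(1:2), B = factor(1:3))

# Each error term defines a table (a factor whose levels are the cells).
cells <- list(
  "Subj"     = d$Subj,
  "Subj:A"   = interaction(d$Subj, d$A),
  "Subj:B"   = interaction(d$Subj, d$B),
  "Subj:A:B" = interaction(d$Subj, d$A, d$B)
)

# In a balanced layout the number of observations per cell is constant
# within each table, so unique() returns a single value per term.
counts <- sapply(cells, function(f) unique(as.vector(table(f))))
print(counts)

# Simulate pure error from the model: for each term, draw one random
# value per cell and share it among the observations in that cell.
set.seed(1)
sds <- c("Subj" = 2, "Subj:A" = 1, "Subj:B" = 1, "Subj:A:B" = 0.5)  # assumed
d$y <- 0
for (nm in names(cells)) {
  f <- cells[[nm]]
  d$y <- d$y + rnorm(nlevels(f), sd = sds[[nm]])[as.integer(f)]
}
```

Observations from the same subject share the Subj draw, observations 
from the same subject and the same level of A additionally share the 
Subj:A draw, and so on down to Subj:A:B, which here is one draw per 
observation.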

This error model defines a decomposition of the data into "error strata", 
each of which corresponds to certain contrasts of means: variation of 
subject means around the grand mean, variation of within-subject "A" 
means around the subject mean, ditto for the "B" means, and finally the 
residual, alias the within-subject interaction contrasts.
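
The decomposition can be written out by hand with cell means. The 
following is only a sketch on simulated numbers (the layout and the 
response are invented, and there is one observation per Subj:A:B cell), 
but it shows that the four pieces add back up to the centred data and, 
because the layout is balanced, that their sums of squares add up too.

```r
set.seed(1)
d <- expand.grid(Subj = factor(1:6), A = factor(1:2), B = factor(1:3))
d$y <- rnorm(nrow(d))                 # made-up response

grand <- mean(d$y)
m_S   <- ave(d$y, d$Subj)             # subject means
m_SA  <- ave(d$y, d$Subj, d$A)        # within-subject "A" means
m_SB  <- ave(d$y, d$Subj, d$B)        # within-subject "B" means

# One component per error stratum.
s_Subj <- m_S  - grand                # subject means around the grand mean
s_A    <- m_SA - m_S                  # "A" means around the subject mean
s_B    <- m_SB - m_S                  # "B" means around the subject mean
s_res  <- d$y - m_SA - m_SB + m_S     # within-subject interaction contrasts

# The components reconstruct the centred data ...
recon_ok <- all.equal(s_Subj + s_A + s_B + s_res, d$y - grand)

# ... and are mutually orthogonal, so the sums of squares are additive.
ss <- c("Subj"     = sum(s_Subj^2),
        "Subj:A"   = sum(s_A^2),
        "Subj:B"   = sum(s_B^2),
        "Subj:A:B" = sum(s_res^2))
ss_ok <- all.equal(sum(ss), sum((d$y - grand)^2))
print(ss)
```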

There are now two crucial points: (1) You can treat each component as if 
it had been based on independent data with a different variance for each 
stratum, and (2) in "nice" (orthogonal) designs it turns out that the 
systematic terms distribute into error strata, so that significance of A 
is evaluated in the Subj:A stratum, etc.
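
Both points can be checked numerically. The sketch below uses simulated 
data (an invented balanced layout with an arbitrary A effect added), 
fits the multistratum model, and then compares the F statistic for A 
from the Subj:A stratum with the one obtained by treating the 
subject-by-A means as if they were an independent data set with Subj as 
a blocking factor. The extraction by position assumes that A is the 
first row of the Subj:A stratum table, which is how summary() lays it 
out for this design.

```r
set.seed(2)
d <- expand.grid(Subj = factor(1:6), A = factor(1:2), B = factor(1:3))
d$y <- rnorm(nrow(d)) + as.integer(d$A)    # made-up data with an A effect

fit <- aov(y ~ A*B + Error(Subj/(A*B)), data = d)
print(summary(fit))    # A appears under "Error: Subj:A", B under "Error: Subj:B"

# F statistic for A as reported in the Subj:A stratum.
tab_strata <- summary(fit)[["Error: Subj:A"]][[1]]
F_strata   <- tab_strata[["F value"]][1]

# The same test from the subject-by-A means alone: a randomized-block
# analysis with Subj as block (table rows are Subj, A, Residuals).
agg       <- aggregate(y ~ Subj + A, data = d, FUN = mean)
tab_means <- summary(aov(y ~ Subj + A, data = agg))[[1]]
F_means   <- tab_means[["F value"]][2]

same_F <- all.equal(F_strata, F_means)
print(c(strata = F_strata, means = F_means))
```

The sums of squares in the two analyses differ by a constant factor 
(the number of B levels averaged over), which cancels in the F ratio.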

(As you see, this easily gets long-winded to explain, and I even glossed 
over a number of rather important details.)

-- 
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)              FAX: (+45) 35327907


