[R] Type II and III sum of square in Anova (R, car package)

John Fox jfox at mcmaster.ca
Sun Aug 27 15:51:57 CEST 2006


Dear Amasco,

A complete explanation of the issues that you raise is awkward in an email,
so I'll address your questions briefly. Section 8.2 of my text, Applied
Regression Analysis, Linear Models, and Related Methods (Sage, 1997) has a
detailed discussion.

(1) In balanced designs, so-called "Type I," "II," and "III" sums of squares
are identical. If the STATISTICA help says that Type II tests are only
appropriate in balanced designs, then that doesn't make a whole lot of sense
(unless one believes that Type-II tests are nonsense, which is not the
case).
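
A minimal sketch of point (1) in base R, on simulated data (the data frame d,
its columns, and the effect sizes are made up for illustration). In a balanced
layout the sequential sums of squares do not depend on the order of the
factors, and they agree with the sum-to-zero "Type III" computation:

```r
# Balanced 2 x 3 design: every A:B cell holds exactly 4 simulated observations.
set.seed(1)
d <- expand.grid(A = factor(c("a1", "a2")),
                 B = factor(c("b1", "b2", "b3")),
                 rep = 1:4)
d$y <- rnorm(nrow(d)) + as.integer(d$A) + 0.5 * as.integer(d$B)

# Sequential (Type I) sums of squares with the factors entered in both orders.
ss_AB <- anova(lm(y ~ A * B, data = d))
ss_BA <- anova(lm(y ~ B * A, data = d))

# "Type III" sums of squares: drop each term from the full model,
# using sum-to-zero contrasts for both factors.
m_sum  <- lm(y ~ A * B, data = d,
             contrasts = list(A = "contr.sum", B = "contr.sum"))
ss_III <- drop1(m_sum, scope = . ~ ., test = "F")

# With equal cell counts, all of these comparisons should be TRUE.
print(isTRUE(all.equal(ss_AB["A", "Sum Sq"], ss_BA["A", "Sum Sq"])))
print(isTRUE(all.equal(ss_AB["B", "Sum Sq"], ss_BA["B", "Sum Sq"])))
print(isTRUE(all.equal(ss_III["A", "Sum of Sq"], ss_AB["A", "Sum Sq"])))
```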

(2) One should concentrate not directly on different "types" of sums of
squares, but on the hypotheses to be tested. Sums of squares and F-tests
should follow from the hypotheses. Type-II and Type-III tests (if the latter
are properly formulated) test hypotheses that are reasonably construed as
tests of main effects and interactions in unbalanced designs. In unbalanced
designs, Type-I sums of squares usually test hypotheses of interest only by
accident. 
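
To illustrate why sequential tests are hard to interpret in unbalanced data,
here is a small base-R sketch on simulated data (the data frame u and the
pattern of omitted observations are assumptions chosen only to unbalance the
design). The sum of squares attributed to A changes with the order in which
the factors are entered, because each order tests a different hypothesis:

```r
# Start from a balanced 2 x 3 layout, then remove observations unevenly.
set.seed(2)
d <- expand.grid(A = factor(c("a1", "a2")),
                 B = factor(c("b1", "b2", "b3")),
                 rep = 1:6)
d$y <- rnorm(nrow(d)) + as.integer(d$A) + 0.5 * as.integer(d$B)
omit <- (d$A == "a1" & d$B == "b1" & d$rep > 2) |
        (d$A == "a2" & d$B == "b3" & d$rep > 4)
u <- d[!omit, ]
print(table(u$A, u$B))   # unequal, non-proportional cell counts

# Type I (sequential) sums of squares in the two possible orders.
ss_A_first  <- anova(lm(y ~ A + B, data = u))   # A ignoring B, then B after A
ss_A_second <- anova(lm(y ~ B + A, data = u))   # B ignoring A, then A after B

# The two sums of squares for A answer different questions and differ.
print(ss_A_first["A", "Sum Sq"])
print(ss_A_second["A", "Sum Sq"])
```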

(3) Type-II sums of squares are constructed obeying the principle of
marginality, so the kinds of contrasts employed to represent factors are
irrelevant to the sums of squares produced. You get the same answer for any
full set of contrasts for each factor. In general, the hypotheses tested
assume that terms to which a particular term is marginal are zero. So, for
example, in a three-way ANOVA with factors A, B, and C, the Type-II test for
the AB interaction assumes that the ABC interaction is absent, and the test
for the A main effect assumes that the ABC, AB, and AC interactions are
absent (but not necessarily the BC interaction, since the A main effect is
not marginal to this term). A general justification is that we're usually
not interested, e.g., in a main effect that's marginal to a nonzero
interaction.
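
For a two-way layout, the Type-II sum of squares for A can be sketched in base
R as a comparison of the model with B alone against the model with A and B
(the AB interaction, to which A is marginal, is left out). The simulated data
frame u and the helper type2_ss_A() are hypothetical names used only for this
illustration. The result should not depend on the contrast coding:

```r
# Unbalanced 2 x 3 layout built from simulated data.
set.seed(3)
d <- expand.grid(A = factor(c("a1", "a2")),
                 B = factor(c("b1", "b2", "b3")),
                 rep = 1:6)
d$y <- rnorm(nrow(d)) + as.integer(d$A) + 0.5 * as.integer(d$B)
omit <- (d$A == "a1" & d$B == "b1" & d$rep > 2) |
        (d$A == "a2" & d$B == "b3" & d$rep > 4)
u <- d[!omit, ]

# Hypothetical helper: SS for A after B, computed under a given contrast coding.
type2_ss_A <- function(contr) {
  m_B  <- lm(y ~ B,     data = u, contrasts = list(B = contr))
  m_AB <- lm(y ~ A + B, data = u, contrasts = list(A = contr, B = contr))
  anova(m_B, m_AB)[2, "Sum of Sq"]
}

ss_treatment <- type2_ss_A("contr.treatment")
ss_sum       <- type2_ss_A("contr.sum")
ss_helmert   <- type2_ss_A("contr.helmert")
print(c(treatment = ss_treatment, sum = ss_sum, helmert = ss_helmert))

# If the car package is installed, the A row of its Type-II table
# should show the same sum of squares.
if (requireNamespace("car", quietly = TRUE)) {
  print(car::Anova(lm(y ~ A * B, data = u), type = "II"))
}
```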

(4) Type-III tests do not assume that terms higher-order to the term in
question are zero. For example, in a two-way design with factors A and B,
the Type-III test for the A main effect tests whether the population
marginal means at the levels of A (i.e., averaged across the levels of B)
are the same. One can test this hypothesis whether or not A and B interact,
since the marginal means can be formed whether or not the profiles of means
for A within levels of B are parallel. Whether the hypothesis is of interest
in the presence of interaction is another matter, however. To compute
Type-III tests using incremental F-tests, one needs contrasts that are
orthogonal in the row-basis of the model matrix. In R, this means, e.g.,
using contr.sum, contr.helmert, or contr.poly (all of which will give you
the same SS), but not contr.treatment. Failing to be careful here will
result in testing hypotheses that are not reasonably construed, e.g., as
hypotheses concerning main effects.
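
The dependence of Type-III results on contrast coding can be sketched in base
R with drop1(), which removes each term from the full model in turn. The
simulated data frame u and the helper type3_ss_A() are hypothetical names for
this illustration only. Sum-to-zero and Helmert contrasts should agree, while
treatment contrasts yield a different sum of squares for A that is not a test
of the A main effect in the sense described above:

```r
# Unbalanced 2 x 3 layout with a simulated A:B interaction.
set.seed(4)
d <- expand.grid(A = factor(c("a1", "a2")),
                 B = factor(c("b1", "b2", "b3")),
                 rep = 1:6)
d$y <- rnorm(nrow(d)) + as.integer(d$A) + 0.5 * as.integer(d$B) +
  2 * (d$A == "a2" & d$B == "b3")
omit <- (d$A == "a1" & d$B == "b1" & d$rep > 2) |
        (d$A == "a2" & d$B == "b3" & d$rep > 4)
u <- d[!omit, ]

# Hypothetical helper: SS for dropping A from the full model y ~ A * B
# while keeping B and A:B, under a given contrast coding.
type3_ss_A <- function(contr) {
  m <- lm(y ~ A * B, data = u, contrasts = list(A = contr, B = contr))
  drop1(m, scope = . ~ ., test = "F")["A", "Sum of Sq"]
}

ss_sum       <- type3_ss_A("contr.sum")
ss_helmert   <- type3_ss_A("contr.helmert")
ss_treatment <- type3_ss_A("contr.treatment")
print(c(sum = ss_sum, helmert = ss_helmert, treatment = ss_treatment))

# If the car package is installed, Anova() with type = "III" on a model
# fitted with sum-to-zero contrasts gives the corresponding table.
if (requireNamespace("car", quietly = TRUE)) {
  m_sum <- lm(y ~ A * B, data = u,
              contrasts = list(A = "contr.sum", B = "contr.sum"))
  print(car::Anova(m_sum, type = "III"))
}
```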

(5) The same considerations apply to linear models that include quantitative
predictors -- e.g., ANCOVA. Most software will not automatically produce
sensible Type-III tests, however.

I hope this helps,
 John

--------------------------------
John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
-------------------------------- 

> -----Original Message-----
> From: r-help-bounces at stat.math.ethz.ch 
> [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Amasco 
> Miralisus
> Sent: Saturday, August 26, 2006 5:07 PM
> To: r-help at stat.math.ethz.ch
> Subject: [R] Type II and III sum of square in Anova (R, car package)
> 
> Hello everybody,
> 
> I have some questions about ANOVA in general and about ANOVA in R
> in particular.
> I am not a statistician, so I would appreciate it
> if you could answer them in a simple way.
> 
> 1. First of all, a more general question. The standard anova()
> function for lm() or aov() models in R implements Type I
> (sequential) sums of squares, which are not well suited to
> unbalanced ANOVA. Therefore it is better to use the
> Anova() function from the car package, which was programmed by
> John Fox to use Type II and Type III sums of squares. Did I
> get the point?
> 
> 2. Now a more specific question. Type II sums of squares are not
> well suited to unbalanced ANOVA designs either (as stated in the
> STATISTICA help), so is the general rule of thumb to
> use the Anova() function with Type II SS only for balanced ANOVA
> and the Anova() function with Type III SS for unbalanced ANOVA?
> Is this the correct interpretation?
> 
> 3. I have found a post from John Fox in which he wrote that
> Type III SS could be misleading if someone uses certain
> contrasts. What is this about?
> Could you please advise when it is appropriate to use Type
> II and when Type III SS? I do not use contrasts for
> comparisons, just general ANOVA with subsequent Tukey
> post-hoc comparisons.
> 
> Thank you in advance,
> Amasco