[R] anova.lm F test confusion

peter dalgaard pdalgd at gmail.com
Wed Mar 21 11:25:49 CET 2012

On Mar 21, 2012, at 09:04 , Rolf Turner wrote:

> On 21/03/12 20:19, Gerrit Eichner wrote:
>> Dear Ben, or anybody else, of course,
>> I'd be grateful if you could point me to a reference (different from ch. 4 "Linear models" in "Statistical Models in S" (Chambers & Hastie (1992))) regarding the (asserted F-)distributional properties of the test statistic (used, e.g., by anova.lm()) to compare model 1 with model 2 using the MSE of model 3 in a sequence of three nested (linear) models? (A short RSiteSearch() and a google search didn't lead me far ...)
>> Thx in advance!
> A good, if somewhat dry, reference on this is "Theory and Application of the
> Linear Model" by Franklin A. Graybill, Duxbury, 1976.
> There are of course many, *many* other such books.

The whole thing is of course a fairly straightforward consequence of Fisher-Cochran's theorem. This says that, in the absence of systematic effects, the Sums of Squares in the ANOVA table are proportional to independent chi-square variables. Hence, the ratio of any pair of Mean Squares or pooled Mean Squares has an F distribution. This holds for the sort of ANOVA tables that decompose the total sum of squares (i.e., not the drop1 or Type II-IV style of table).
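To spell that out for the three-model case in the question, here is a sketch in notation of my own choosing (RSS_j and \nu_j for the residual sum of squares and residual DF of model j, \sigma^2 for the error variance), assuming independent normal errors with common variance and that model 1 is true:

```latex
% Nested linear models M_1 \subset M_2 \subset M_3 (notation is illustrative).
% Under M_1, the pieces of the decomposition are independent:
\[
\frac{RSS_1 - RSS_2}{\sigma^2} \sim \chi^2_{\nu_1 - \nu_2}, \qquad
\frac{RSS_2 - RSS_3}{\sigma^2} \sim \chi^2_{\nu_2 - \nu_3}, \qquad
\frac{RSS_3}{\sigma^2} \sim \chi^2_{\nu_3}.
\]
% Common ("overall error") denominator:
\[
F = \frac{(RSS_1 - RSS_2)/(\nu_1 - \nu_2)}{RSS_3/\nu_3}
  \sim F_{\nu_1 - \nu_2,\ \nu_3}.
\]
% Pooled denominator, using RSS_2 = RSS_3 + (RSS_2 - RSS_3):
\[
F' = \frac{(RSS_1 - RSS_2)/(\nu_1 - \nu_2)}{RSS_2/\nu_2}
   \sim F_{\nu_1 - \nu_2,\ \nu_2}.
\]
```

Either ratio is a mean square over an independent mean square, which is all the F distribution requires; the two differ only in how many DF end up in the denominator.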

The convention of dividing by the "overall error" term rather than successively pooling terms according to model reductions is pervasive both in statistical literature and in software. This stems from the emphasis on balanced designs in the days before electronic computers: When you know that the sums of squares are independent of the testing order, you rather like to have the same property for the F tests, so that all conclusions can be conveniently read from a single ANOVA table. 

By and large, it doesn't make a whole lot of difference whether you gain a few denominator DF. Using a common denominator does, however, imply that the F tests are not independent of each other, as opposed to the "proper" successive F tests.
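For what it's worth, the two denominators can be compared on simulated data. This is only a minimal sketch: the data, the variable names (x1, x2, x3, m1, m2, m3) and the seed are made up for illustration, and it relies on my understanding that anova() applied to several lm fits takes its scale from the largest model.

```r
# Simulated data; every name below is illustrative, not from the thread.
set.seed(1)
n  <- 40
x1 <- rnorm(n); x2 <- rnorm(n); x3 <- rnorm(n)
y  <- 1 + 0.5 * x1 + rnorm(n)      # only x1 has a real effect

m1 <- lm(y ~ x1)
m2 <- lm(y ~ x1 + x2)
m3 <- lm(y ~ x1 + x2 + x3)

# Residual sums of squares and residual DF for the three nested models
rss <- c(deviance(m1), deviance(m2), deviance(m3))
rdf <- c(df.residual(m1), df.residual(m2), df.residual(m3))

# Numerator mean square for model 1 vs model 2
ms_12 <- (rss[1] - rss[2]) / (rdf[1] - rdf[2])

# (a) common denominator: residual mean square of the largest model
F_common <- ms_12 / (rss[3] / rdf[3])
# (b) pooled denominator: residual mean square of model 2
F_pooled <- ms_12 / (rss[2] / rdf[2])

# anova() on all three models should reproduce (a) in its second row;
# anova() on just m1 and m2 should reproduce (b).
tab <- anova(m1, m2, m3)
all.equal(tab$F[2], F_common)
all.equal(anova(m1, m2)$F[2], F_pooled)

# Corresponding p-values; the denominator DF differ by one here
pf(F_common, rdf[1] - rdf[2], rdf[3], lower.tail = FALSE)
pf(F_pooled, rdf[1] - rdf[2], rdf[2], lower.tail = FALSE)
```

With one DF of difference between the denominators, the two p-values will typically be close, which is the "not a whole lot of difference" point above.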

Peter Dalgaard, Professor
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com
