[R] Marginal (type II) SS for powers of continuous variables in a linear model?

Prof Brian Ripley ripley at stats.ox.ac.uk
Mon Aug 11 17:47:50 CEST 2003


Anova != anova.

drop1 is the part of R that does type II sum of squares, and it works in 
your example.  So does Anova in the current car:

> drop1(lm(y~x+I(x^2), Df))  # add test="F" if you like
Single term deletions

Model:
y ~ x + I(x^2)
       Df Sum of Sq    RSS    AIC
<none>              8.3117 5.2839
x       1    0.5490 8.8607 3.8596
I(x^2)  1    0.5772 8.8889 3.8882
> library(car)
> Anova(lm(y~x+I(x^2), Df))
Anova Table (Type II tests)

Response: y
          Sum Sq Df F value Pr(>F)
x         0.5490  1  0.3963 0.5522
I(x^2)    0.5772  1  0.4167 0.5425
Residuals 8.3117  6               
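
To see what those tables report: the marginal SS for a term is the increase in RSS when that term alone is dropped from the full model. A minimal sketch in base R, using the Df from the quoted example below (the names fit_full, rss, ss_x and ss_x2 are illustrative, not from the thread):

```r
# Data from the quoted example further down the thread
Df <- data.frame(x = 1:9, y = rep(c(-1, 1), length.out = 9))

fit_full <- lm(y ~ x + I(x^2), data = Df)

# Residual sum of squares of a fitted model (illustrative helper)
rss <- function(fit) sum(residuals(fit)^2)

# Marginal SS: RSS of the model without the term, minus RSS of the full model
ss_x  <- rss(lm(y ~ I(x^2), data = Df)) - rss(fit_full)
ss_x2 <- rss(lm(y ~ x, data = Df)) - rss(fit_full)

# drop1() reports the same quantities; the first row is <none>, so skip it
d1 <- drop1(fit_full, test = "F")
all.equal(c(ss_x, ss_x2), d1[["Sum of Sq"]][-1])
```

These should match the 0.5490 and 0.5772 shown above, up to rounding.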


And in summary.aov() those *are* marginal SS, as balance is assumed
for aov models. (That is not to say the software does not work otherwise, 
but the interpretability depends on balance.)
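
A small sketch of the balance point, on simulated data (the layout, the seed and the names d, seq_AB, seq_BA are all hypothetical, chosen only for illustration): in a balanced main-effects layout the sequential SS for a factor does not depend on the order of the terms, so it coincides with the marginal SS from drop1.

```r
# Hypothetical balanced layout: 2 x 3 factor levels, 4 replicates per cell
set.seed(1)
d <- expand.grid(A = factor(1:2), B = factor(1:3), rep = 1:4)
d$y <- rnorm(nrow(d))

# Sequential (type I) tables with the terms in both orders
seq_AB <- anova(lm(y ~ A + B, data = d))
seq_BA <- anova(lm(y ~ B + A, data = d))

# Marginal SS from single-term deletions
marg <- drop1(lm(y ~ A + B, data = d))

# With balance, all three agree for A
all.equal(seq_AB["A", "Sum Sq"], seq_BA["A", "Sum Sq"])
all.equal(seq_AB["A", "Sum Sq"], marg["A", "Sum of Sq"])
```

Deleting a few rows of d to unbalance the design would be one way to see the order dependence reappear.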


On Mon, 11 Aug 2003, Spencer Graves wrote:

> I'm confused.  Consider the following example:
> 
>  > Df <- data.frame(x=1:9, y=rep(c(-1,1), length=9))
>  > anova(lm(y~x, Df))
> Analysis of Variance Table
> 
> Response: y
>            Df    Sum Sq   Mean Sq   F value Pr(>F)
> x          1 2.861e-34 2.861e-34 2.253e-34      1
> Residuals  7    8.8889    1.2698
>  > anova(lm(y~x+I(x^2), Df))
> Analysis of Variance Table
> 
> Response: y
>            Df    Sum Sq   Mean Sq   F value Pr(>F)
> x          1 2.861e-34 2.861e-34 2.065e-34 1.0000
> I(x^2)     1    0.5772    0.5772    0.4167 0.5425
> Residuals  6    8.3117    1.3853
>  > anova(lm(y~I(x^2)+x, Df))
> Analysis of Variance Table
> 
> Response: y
>            Df Sum Sq Mean Sq F value Pr(>F)
> I(x^2)     1 0.0282  0.0282  0.0203 0.8912
> x          1 0.5490  0.5490  0.3963 0.5522
> Residuals  6 8.3117  1.3853
>  >
> 	  In S-Plus 6.1, the ANOVA table is preceded by a statement, "Terms 
> added sequentially (first to last)".  From these examples, it certainly 
> looks like this is what it is doing.  Apart from round-off error, the 
> sum of squares and mean squares are identical for the models without and 
> with I(x^2).  In an example with a nonzero sum of squares for x, the F 
> value would be different, because the mean square for residuals would be 
> different, and the Pr(>F) would also be affected by differing degrees of 
> freedom.
> 
> 	  The third example here puts I(x^2) before x in the model statement 
> and gets a clearly different anova.  (The coefficients should not 
> change when the order of the terms is modified, though they could change 
> if other terms are added.  I didn't check that for this example, but 
> I've done this before and would be surprised if they were different.)
> 
> Best Wishes,
> Spencer
> 
> Bjørn-Helge Mevik wrote:
> > I've used Anova() from the car package to get marginal (aka type II)
> > sum-of-squares and tests for linear models with categorical
> > variables.  Is it possible to get marginal SSs also for continuous
> > variables, when the model includes powers of the continuous variables?
> > 
> > For instance, if A and B are categorical ("factor"s) and x is
> > continuous ("numeric"),
> > 
> > Anova (lm (y ~ A*B + x, ...))
> > 
> > will produce marginal SSs for all terms (A, B, A:B and x).  However,
> > with 
> > 
> > Anova (lm (y ~ A*B + x + I(x^2), ...))
> > 
> > the SS for 'x' is calculated with I(x^2) present in the model, i.e. it
> > is no longer marginal.
> > 
> > Using poly (x, 2) instead of x + I(x^2), one gets a marginal SS for
> > the total effect of x, but not for the linear and quadratic effects
> > separately.  (summary.aov() has a 'split' argument that can be used to
> > get separate SSs, but these are not marginal.)

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



