[R] Odd anova(lm()) order phenomenon, looking for an explanation

Andrew Robinson A.Robinson at ms.unimelb.edu.au
Fri Mar 31 11:17:08 CEST 2006


Hi everyone,

I'm witnessing an odd modelling phenomenon that I can't explain.  If
anyone who has seen this before, or who can explain what's going on,
would let me know, I'd be very grateful!  Especially if I'm just being
dim.

I'm fitting a pair of continuous variates and their interaction to
some residuals from another model.  The sequential anova table
changes with the term order; that's fine.  But each term explains a
much larger Sum Sq when it is listed second than when it is listed
first.  In the two main-effects fits below, canopy.h goes from 156.2
(first) to 353.8 (second), and canopy.d from 0.4 (first) to 198.0
(second).



> anova(lm(residuals(delta.point.lm.0) ~ canopy.h + canopy.d,
+          data=snow))
Analysis of Variance Table

Response: residuals(delta.point.lm.0)
           Df Sum Sq Mean Sq F value    Pr(>F)    
canopy.h    1  156.2   156.2  11.118 0.0009613 ***
canopy.d    1  198.0   198.0  14.098 0.0002080 ***
Residuals 303 4256.6    14.0                      
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

> anova(lm(residuals(delta.point.lm.0) ~ canopy.d + canopy.h,
+          data=snow))
Analysis of Variance Table

Response: residuals(delta.point.lm.0)
           Df Sum Sq Mean Sq F value    Pr(>F)    
canopy.d    1    0.4     0.4  0.0284    0.8664    
canopy.h    1  353.8   353.8 25.1871 8.887e-07 ***
Residuals 303 4256.6    14.0                      
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

I would have expected any term to explain less Sum Sq if listed second
than if listed first.  Is my intuition awry?  Does anyone have any
modelling insight to help me interpret what I'm seeing?
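
To make the pattern concrete, here is a minimal sketch on simulated
data, not the snow data.  It assumes two positively correlated
predictors whose effects on the response have opposite signs, which is
one setting where each term picks up more Sum Sq when listed second.
The names h, d, y and seq_ss are hypothetical stand-ins introduced
only for this illustration.

```r
# Simulated illustration only; h and d are hypothetical stand-ins for
# canopy.h and canopy.d, and y for the residuals being modelled.
set.seed(1)
n <- 306
z <- rnorm(n)                  # shared component, makes h and d correlated
h <- z + rnorm(n, sd = 0.5)
d <- z + rnorm(n, sd = 0.5)
y <- h - d + rnorm(n)          # assumed opposite-signed effects

# Sequential sums of squares for a given term order, named by term
seq_ss <- function(formula) {
  tab <- anova(lm(formula))
  setNames(tab[["Sum Sq"]], rownames(tab))
}

ss_hd <- seq_ss(y ~ h + d)     # h listed first, d second
ss_dh <- seq_ss(y ~ d + h)     # d listed first, h second

# Each term's Sum Sq when listed first versus second
print(rbind(first  = c(h = ss_hd[["h"]], d = ss_dh[["d"]]),
            second = c(h = ss_dh[["h"]], d = ss_hd[["d"]])))
```

In both orders the two terms' Sum Sq add up to the same total, as they
do in the tables above (156.2 + 198.0 = 0.4 + 353.8 = 354.2).  Whether
this setting applies to the snow data would depend on the correlation
between canopy.h and canopy.d and on the signs of the fitted
coefficients.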

Cheers

Andrew
-- 
Andrew Robinson  
Department of Mathematics and Statistics            Tel: +61-3-8344-9763
University of Melbourne, VIC 3010 Australia         Fax: +61-3-8344-4599
Email: a.robinson at ms.unimelb.edu.au         http://www.ms.unimelb.edu.au



