[R] lm and aov produce different results for nested fixed-factor anova

Mark Difford mark_difford at yahoo.co.uk
Fri Feb 20 18:35:25 CET 2009


Hi Sergii,

>> I have trouble obtaining the same results for nested Anova with two fixed
>> factors when using lm and aov functions.

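There is no difference between the two if you treat them equally, i.e. if
you summarize them in the same way. summary() on an aov object prints the
ANOVA table, whereas summary() on an lm object prints the coefficient table
and the overall F-test, so the two outputs you compared are not the same
kind of summary.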

## Try (ANOVA table from both fits):
anova(e2)
summary(e1)

## Or (coefficient table from both fits):
summary.lm(e1)
summary(e2)
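
As a minimal sketch of this point on simulated data (the names d, fit_aov
and fit_lm are made up for illustration, and the balanced toy design is an
assumption, not your dataset), the two fits agree once they are summarized
in the same way:

```r
set.seed(1)
# Hypothetical data: 4 levels of x, 3 levels of z within each, 5 replicates
d <- expand.grid(rep = 1:5, z = factor(1:3), x = factor(1:4))
d$y <- rnorm(nrow(d))

fit_aov <- aov(y ~ x/z, data = d)
fit_lm  <- lm(y ~ x/z, data = d)

tab_aov <- summary(fit_aov)[[1]]  # ANOVA table from the aov fit
tab_lm  <- anova(fit_lm)          # ANOVA table from the lm fit

# Same sums of squares and F statistics from both fits
all.equal(tab_aov[["Sum Sq"]], tab_lm[["Sum Sq"]])
all.equal(tab_aov[["F value"]], tab_lm[["F value"]])

# Same coefficient table from both fits
all.equal(coef(summary.lm(fit_aov)), coef(summary(fit_lm)))
```

The rows of anova(fit_lm) give the decomposition into x and x:z with their
F-tests, which is the same table that summary(fit_aov) prints.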

Your nested model is also unusual; the customary/correct form is:

e2 <- lm(y ~ z/x - 1)

The nesting introduces an intercept for each level of the nesting factor.
Also, x is presently a factor (with 48 levels). Is that what you really
want?
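
To make these two points concrete, here is a small sketch on made-up data
(d and fit are hypothetical names, and I am assuming a balanced toy design
rather than your real data). terms() shows how each nested formula expands,
and nlevels() reports how many levels a factor has:

```r
set.seed(2)
# Hypothetical data: 3 levels of z, 4 levels of x, 5 replicates per cell
d <- expand.grid(rep = 1:5, x = factor(1:4), z = factor(1:3))
d$y <- rnorm(nrow(d))

# How the nesting operator expands
attr(terms(y ~ x/z), "term.labels")      # "x" "x:z"
attr(terms(y ~ z/x - 1), "term.labels")  # "z" "z:x"

# Without the overall intercept, each level of z gets its own coefficient
fit <- lm(y ~ z/x - 1, data = d)
names(coef(fit))[1:3]                    # "z1" "z2" "z3"

# Check whether x is a factor and how many levels it has
is.factor(d$x)
nlevels(d$x)                             # 4
```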

Regards, Mark.



Sergii Ivakhno wrote:
> 
> Dear  R users,
> 
> I have trouble obtaining the same results for nested Anova with two fixed
> factors when using lm and aov functions. 
> 
> The formulas are:
> 
>> e1=aov(y~x/z)
> 
>> e2=lm(y~x/z)
> 
>  
> 
> summary(e1)
>                Df Sum Sq Mean Sq F value    Pr(>F)
> x              47  260.0     5.5 18.0088 < 2.2e-16 ***
> x:z           195  169.6     0.9  2.8318 < 2.2e-16 ***
> Residuals   14425 4430.3     0.3
> ---
> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> 2 observations deleted due to missingness
> 
>  
> 
> For e2:
> 
> Residual standard error: 0.5542 on 14425 degrees of freedom
>   (2 observations deleted due to missingness)
> Multiple R-squared: 0.08839,    Adjusted R-squared: 0.07309
> F-statistic: 5.779 on 242 and 14425 DF,  p-value: < 2.2e-16
> 
> I prefer to use lm, as in my case I want to know the difference between
> the first control group and all the other factor levels through the
> regression coefficients. The same is true for levels of the nested factor
> within each level of the main factor.
> 
>  
> 
> Since I am a fair novice at running linear models in R, I am not sure what
> can cause this problem; it also seems that lm does not provide the
> decomposition of MS into MS(x) and MS(z) and the corresponding F-test
> statistics. (Is it possible to estimate them from the lm output?)
> 
>  
> 
> Finally, a few words about the dataset: the main factor x has 48 levels,
> each repeated from 60 to 540 times, and represents different patients. The
> nested factor z has 9 levels, but not all of them occur within every level
> of factor x. Although the nested factor levels are independent between
> levels of the main factor (i.e. they are samples taken from different
> tissues of each patient), considering the large size of the dataset I was
> advised on this forum to use the same encoding of levels of the nested
> factor z at each level of factor x. I am not sure whether this influences
> the QR decomposition and leads to the differences that I observe.
> 
> I would greatly appreciate your help, as after reading the help pages I
> still cannot understand the cause of the lm vs aov discrepancy.
> 
> The dataset with three factors can be downloaded from
> 
> http://www.compbio.group.cam.ac.uk/Resources/Sergii_temp/example.RData 
> 
>  
> 
> Thank you,
> 
> Sergii  
> 
>  
>  
>   
> ----------------------------------------------
> Sergii Ivakhno
> 
> PhD student
> 
> Computational Biology Group
> Cancer Research UK Cambridge Research Institute
> Li Ka Shing Centre
> Robinson Way
> Cambridge CB2 0RE
> England
> 
> +44 (0)1223 404293 (O)
> +44 (0)1223 404128 (F)
> 
> http://www.compbio.group.cam.ac.uk/
> 
> 
