[BioC] Linear Models and ANOVA

James W. MacDonald jmacdon at med.umich.edu
Fri Dec 17 16:12:26 CET 2010


Hi Thomas,

On 12/16/2010 4:03 PM, Thomas Hampton wrote:
> This is an off topic question related more to R and statistics, but I
> will impose myself, if you don't mind.

You are correct. This has nothing to do with Bioconductor, nor even the 
analysis of high-throughput data. You would be better served by asking 
on R-help, although you might need a fairly thick skin, depending on who 
replies, as this isn't really a question about R.

Alternatively, you could do some reading on your own to see why the 
output is different. See

?anova.lm
?summary.lm

which should clear up the confusion for you.
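
In short, the two help pages say this: anova.lm tests the terms sequentially, each one added after the terms listed before it, while summary.lm tests each coefficient given every other term in the model. With R's default treatment contrasts and an interaction in the model, the Genderm coefficient is the gender difference at the reference dosage only, not the gender effect averaged over both dosages. Below is a minimal sketch on simulated data (not Tom's data; the seed, sample size, and values are assumptions made up for illustration) showing where the two outputs agree and where they differ:

```r
# Simulated, balanced 2 x 2 design with 4 observations per cell.
# These are NOT the data from the original post.
set.seed(1)
d <- expand.grid(rep = 1:4, Gender = c("f", "m"), Dosage = c("a", "b"))
d$Gender <- factor(d$Gender)
d$Dosage <- factor(d$Dosage)
d$Alertness <- rnorm(nrow(d), mean = 15, sd = 5)

fit <- lm(Alertness ~ Gender * Dosage, data = d)

tt <- coef(summary(fit))  # t-tests, one per coefficient
aa <- anova(fit)          # sequential F-tests, one per term

# The interaction is the last term, so both functions test the same
# hypothesis: t^2 equals F and the p-values coincide.
all.equal(tt["Genderm:Dosageb", "t value"]^2, aa["Gender:Dosage", "F value"])
all.equal(tt["Genderm:Dosageb", "Pr(>|t|)"], aa["Gender:Dosage", "Pr(>F)"])

# Main effects: refit with sum-to-zero contrasts, so each main-effect
# coefficient is averaged over the levels of the other factor.
fit_sum <- lm(Alertness ~ Gender * Dosage, data = d,
              contrasts = list(Gender = "contr.sum", Dosage = "contr.sum"))
ts <- coef(summary(fit_sum))

# In this balanced design the t-tests now match the anova F-tests.
all.equal(ts["Gender1", "t value"]^2, aa["Gender", "F value"])
all.equal(ts["Dosage1", "t value"]^2, aa["Dosage", "F value"])
```

The main effects agree only after switching contrasts, and only because the design is balanced; with unequal cell sizes the sequential tests in anova.lm also depend on the order of the terms.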

If that doesn't help, Julian Faraway's book Linear Models with R is an 
excellent treatment of linear models in R. If you are lucky, you might 
even be able to find the pdf of an earlier version somewhere out on the 
intertubes, as it was freely available before he published.

Best,

Jim


>
> Here is my issue.
>
> R anova is essentially a way to interpret some linear model such as
>
> fit <- lm(y ~a*b)
>
> You can generate nice p values by doing something like
>
> anova(lm(y ~a*b))
>
> But you could also generate p values like this:
>
> summary(lm(y~a*b))
>
> I find though, that the p values you generate may be different
> depending on whether you call summary.lm or whether
> you get them from anova.lm.
>
> For example:
>  > data.ex2=read.table(datafilename,header=TRUE)
>  > summary(lm(formula = Alertness ~ Gender * Dosage, data = data.ex2))
>
> Call:
> lm(formula = Alertness ~ Gender * Dosage, data = data.ex2)
>
> Residuals:
>    Min     1Q Median     3Q    Max
> -6.500 -3.375  0.000  1.562 10.500
>
> Coefficients:
>                 Estimate Std. Error t value Pr(>|t|)
> (Intercept)       15.750      2.546   6.185 4.69e-05 ***
> Genderm           -4.500      3.601  -1.250    0.235
> Dosageb            1.000      3.601   0.278    0.786
> Genderm:Dosageb    0.250      5.093   0.049    0.962
> ---
> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> Residual standard error: 5.093 on 12 degrees of freedom
> Multiple R-squared: 0.2079, Adjusted R-squared: 0.009862
> F-statistic: 1.05 on 3 and 12 DF, p-value: 0.4062
>
>  > anova(lm(formula = Alertness ~ Gender * Dosage, data = data.ex2))
> Analysis of Variance Table
>
> Response: Alertness
>               Df  Sum Sq Mean Sq F value Pr(>F)
> Gender         1  76.562  76.562  2.9518 0.1115
> Dosage         1   5.062   5.062  0.1952 0.6665
> Gender:Dosage  1   0.063   0.063  0.0024 0.9617
> Residuals     12 311.250  25.938
>
>
> The anova output is tidier to look at. But why are the anova p values
> smaller for Gender and Dosage?
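
The posted numbers are enough to check the difference by hand. The sketch below assumes a balanced design with 4 observations per cell (inferred from the 12 residual degrees of freedom, not stated in the post), and the variable names are made up for illustration. It rebuilds the anova sums of squares from the summary coefficients and compares the two standard errors for the Gender effect:

```r
# Coefficients copied from the posted summary() output
b0 <- 15.750; bG <- -4.500; bD <- 1.000; bGD <- 0.250
n_cell <- 4            # ASSUMPTION: 16 observations, 4 in each of the 4 cells
mse <- 311.250 / 12    # residual mean square from the posted anova() table

# Cell means implied by the default treatment contrasts
m <- c(fa = b0, ma = b0 + bG, fb = b0 + bD, mb = b0 + bG + bD + bGD)

# Main effects averaged over the levels of the other factor
gender_diff <- mean(m[c("ma", "mb")]) - mean(m[c("fa", "fb")])
dosage_diff <- mean(m[c("fb", "mb")]) - mean(m[c("fa", "ma")])

# One-df sums of squares in a balanced 2 x 2 design
ss_gender <- n_cell * gender_diff^2   # compare with 76.562 in the anova table
ss_dosage <- n_cell * dosage_diff^2   # compare with 5.062 in the anova table

# F-tests and p-values as anova() reports them
p_gender <- pf(ss_gender / mse, 1, 12, lower.tail = FALSE)
p_dosage <- pf(ss_dosage / mse, 1, 12, lower.tail = FALSE)

# summary() tests Gender at Dosage a only (two cells of 4 observations),
# while anova() tests Gender averaged over both dosages (all 16 observations)
se_simple   <- sqrt(mse * (1 / n_cell + 1 / n_cell))  # compare with 3.601
se_averaged <- sqrt(mse / n_cell)
```

Under that assumption, the averaged Gender effect (-4.375) is about the same size as the Genderm coefficient (-4.500) but is estimated from twice as many observations, so its standard error is smaller and so is its p-value.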
>
>
> Thanks for your help.
>
> Tom

-- 
James W. MacDonald, M.S.
Biostatistician
Douglas Lab
University of Michigan
Department of Human Genetics
5912 Buhl
1241 E. Catherine St.
Ann Arbor MI 48109-5618
734-615-7826