[BioC] Anova; can I use lmFit and eBayes from Limma, or should it be aov?

Gordon K Smyth smyth at wehi.EDU.AU
Tue Nov 20 03:08:57 CET 2012


Dear John,

What limma does is always equivalent to Anova.  Of course it is not 
classical Anova, but rather an empirical Bayes extension of Anova that is 
appropriate for microarray data.

Post hoc tests are done in limma using decideTests(), and many options are 
offered.  You won't find classical methods like TukeyHSD though, because 
limma isn't doing classical Anova and because methods like TukeyHSD don't 
generalize well to high-dimensional datasets like microarrays.

There are a couple of problems with your code below.

Firstly, you have overlooked the fact that limma, unlike classical 
Anova, does not automatically adjust for the intercept term.  (This is for 
historical reasons because limma was originally developed for two colour 
microarrays.)  Instead limma expects you to specify which coefficients or 
contrasts you want to test for.  When you run

   topTable(fit2)

without specifying any coefficients, limma will test whether any of the 
coefficients are different from zero, including the intercept term. 
You can see that it has done this because it has included a column for 
the intercept in the toptable.  This is not what you want.

Secondly, you have failed to specify in your model that patient should be 
a factor in your model rather than a numerical covariate.  It is also 
unclear to me whether time should be a factor or a covariate.  At the 
moment both are being treated as a covariates.

It is a good idea to look at colnames(design) or colnames(fit) to check 
that the terms in the model are what you intended.

Once you fix your model, you can conductor Anova-style F-tests using limma 
simply by specifying a range of coefficients to topTable(), as discussed 
in the User's Guide.

Best wishes
Gordon


> Date: Mon, 19 Nov 2012 10:43:35 +0000
> From: john herbert <arraystruggles at gmail.com>
> To: Bioconductor mailing list <bioconductor at r-project.org>,
> 	bioconductor at stat.math.ethz.ch
> Subject: [BioC] Anova; can I use lmFit and eBayes from Limma, or
> 	should it be aov?
>
> Dear Bioconductors,
> This may be more a question of statistics understanding but it is
> related to Limma.
>
> Is it possible to perform an Anova using functions from the Limma
> package? If so, how can a post hoc test be applied to the eBayes
> "MArrayLM" resulting object?
>
> For me, it is a complex design I have array data, different clients,
> time course, stage of disease and treatment. The Limma manual eludes
> to Limma being equivalent to Anova in some cases for single channel
> arrays (on pages 35 and 41). I am using single channel data.
>
> This code runs;
>
> dd <- data.frame(patient=targets$patient,stage=as.factor(targets$stage),treat=as.factor(targets$treat),time=targets$time)
>
> head(dd)
>
>  patient stage treat time
> 1       1     A     Y   24
> 2       1     A     Y   72
> 3       2     B     N   24
> 4       2     B     N   72
> 5       2     B     N    0
> 6       2     B     Y   72
>
> design <- model.matrix(~patient+stage*treat*time,data=dd)
>
> fit <- lmFit(temp,design)
> fit2 <- eBayes(fit)
>
> topTable(fit2)
>                       ID X.Intercept.       patient      stageB
> stageC       treatY          time stageB.treatY stageC.treatY
> stageB.time   stageC.time
> A_23_P113716 A_23_P113716     17.57842 -0.0042698998 -0.18798801
> -0.18726778 -0.008996453  1.282717e-03    0.38532184    0.18501711
> 0.0025256233  0.0005692329
> A_23_P77779   A_23_P77779     17.07476  0.0015734113 -0.14199010
> -0.07687905 -0.132257119  8.288697e-05    0.04068160    0.03375537
> 0.0024200711  0.0014022044
>
>               treatY.time stageB.treatY.time stageC.treatY.time
> AveExpr         F       P.Value     adj.P.Val
> A_23_P113716 -1.340200e-03      -0.0072744604      -0.0022265946
> 17.49773 144347.32 6.481891e-159 2.657575e-154
> A_23_P77779  -8.857291e-04       0.0008166484      -0.0004867999
> 17.01806  98921.73 8.930817e-153 1.830817e-148
>
>
> How do I interpret the results, if I am performing Anova in Limma
> correctly? What are the numbers for X.intercept and stageB etc?
>
> If I take all probes into a variable with
> top_data <- topTable(fit2,n=200000000)
> and look for ones that are not significant, I don't find any.
> which(top_data$adj.P.Val > 0.01)
> integer(0)
>
> So it looks like every single probe varies significantly between any
> of the conditions, which my instincts say is not correct.
>
> Maybe I should run aov like this instead? But there is still a problem
> of post hoc testing as
>
> aof2 <- function(x,y){
>  aov(x ~ stage*treat*time+Error(patient), y)
> }
> anovaresults <- apply(temp, 1, aof2,dd)
>
> lapply(anovaresults,TukeyHSD)
> Error in UseMethod("TukeyHSD") :
>  no applicable method for 'TukeyHSD' applied to an object of class
> "c('aovlist', 'listof')"
>
> If anyone knows how to solve either of these issues, it will be appreciated.
>
> Thanks,
>
> John.
>

______________________________________________________________________
The information in this email is confidential and intend...{{dropped:4}}



More information about the Bioconductor mailing list