[BioC] LIMMA: testing for batch effects

Gordon Smyth smyth at wehi.edu.au
Thu Apr 13 01:00:17 CEST 2006


>Date: Tue, 11 Apr 2006 15:08:55 +0000
>From: Adaikalavan Ramasamy <ramasamy at cancer.org.uk>
>Subject: [BioC] LIMMA: testing for batch effects
>To: BioConductor mailing list <bioconductor at stat.math.ethz.ch>
>
>Dear all,
>
>We have 63 arrays that are either diseased or normal. We wish to select
>genes adjusting for two covariates : gender (male/female) and
>experimental batch (one/two/three).
>
> >From biological knowledge, we expect the batch effect to be significant
>but we wish to quantify it numerically. Here is a way that we tried and
>the problems we faced. We searched the archives without much success.
>
>
>  library(limma)  # version 2.3.3
>  dd  <- model.matrix( ~ disease + gender + batch, cl )
>  head( dd )
>        (Intercept) diseasenormal gendermale batchtwo batchthree
>  2405            1             1          0        0          0
>  2408            1             1          0        0          0
>  2410            1             1          0        0          0
>  GER15           1             0          0        0          0
>  GER20           1             0          0        0          1
>  GER22           1             0          0        0          0
>
>  fit  <- lmFit(mds.rma, dd)
>
>
>1) Is the following the correct way of setting up contrasts to find the
>three pairwise comparison between batches ?
>
>  contr.m <- cbind( Batch2minus1=c(0,0,0,1,0), Batch2minus1=c(0,0,0,0,1),
>                    Batch3minus2=c(0,0,0,-1,1) )
>
>
>2) From reading the LIMMA user guide, we think decideTests() could be
>potentially useful.
>
>  fit2 <- eBayes( contrasts.fit( fit, contr.m ) )
>  a <- decideTests( fit2, method="global" )
>  summary(a)
>     Batch2minus1 Batch2minus1 Batch3minus2
>  -1        15151        13838         4919
>   0        26574        30561        46703
>   1        12950        10276         3053
>
>Can we say that somewhere between 8000 - 27000 genes are affected by at
>least one batch. Or is there a nicer/proper way of explaining this to a
>biologist.
>
>
>3) Ideally we would like to use method="nestedF" as suggested but we get
>the following error message. Can anyone explain what might possibly be
>going wrong.
>
>  b <- decideTests( fit2, method="nestedF" )
>  Error in if (crossprod(crossprod(Q, x)) > qF[i]) { :
>         missing value where TRUE/FALSE needed

This was a bug that persisted for a short time after the built-in R 
function p.adjust() changed from "fdr" to "BH" in R 2.2.0. The 
change-log entry for limma 2.3.6 reads:

6 November 2005: limma 2.3.6

- decideTests() failed with method="nestedF",adjust.method="BH", now fixed

Regards
Gordon


>Any hints will be much appreciated.
>
>Regards,
>
>--
>Adaikalavan Ramasamy                    ramasamy at cancer.org.uk
>Centre for Statistics in Medicine       http://www.csm-oxford.org/
>Wolfson College Annexe                  http://www.stats.ox.ac.uk/~ramasamy/
>Linton Road                             Tel : 01865 284 408
>Oxford OX2 6UD                          Fax : 01865 284 424



More information about the Bioconductor mailing list