[BioC] limma - Considering Batch Effect

James MacDonald jmacdon at med.umich.edu
Tue Aug 7 19:47:06 CEST 2007


Hi Nathan,


Nathan Haigh wrote:
> I know this topic has arisen a few times from searching the mail
> archive, but I'm still not clear on what/how to do this.
> 
> Here is my experimental setup:
> Array    Param1   Param2   Rep   Batch
> 1        H        4        1     1
> 2        H        4        2     2
> 3        H        4        3     2
> 4        H        20       1     1
> 5        H        20       2     2
> 6        H        20       3     2
> 7        H        37       1     1
> 8        H        37       2     2
> 9        H        37       3     2
> 10       L        4        1     1
> 11       L        4        2     2
> 12       L        4        3     2
> 13       L        20       1     1
> 14       L        20       2     2
> 15       L        20       3     2
> 16       L        37       1     1
> 17       L        37       2     2
> 18       L        37       3     2
> 
> All rep 1's were processed in batch 1 and reps 2 and 3 were processes in
> batch 2. From using GeneSpring, and conducting a cluster analysis on
> experimental parameters, there is a clear batch effect. In GeneSpring, I
> normalised the arrays, such that gene1 from array1 was divided by the
> median of gene1 over arrays from batch1. Likewise, gene1 from array2 was
> divided by the median of gene1 across arrays in batch2. These
> successfully removed the batch effect. I'd like to be able to remove the
> batch effect doing something similar, or accounting for it in limma.
> 
> In essence, I want to remove the variability in the reps caused by the
> batch effect.
> 
> Could anyone explain how this can/should be done?

Sure. Just add the Batch effect to your design matrix. For instance, if 
you want to model the differences in Param1, you would do something like:

 > design <- model.matrix(~0+factor(rep(1:2, 
each=9))+factor(rep(rep(1:2,1:2),6)))
 > colnames(design) <- c("H","L","Batch")
 > design
    H L Batch
1  1 0     0
2  1 0     1
3  1 0     1
4  1 0     0
5  1 0     1
6  1 0     1
7  1 0     0
8  1 0     1
9  1 0     1
10 0 1     0
11 0 1     1
12 0 1     1
13 0 1     0
14 0 1     1
15 0 1     1
16 0 1     0
17 0 1     1
18 0 1     1
attr(,"assign")
[1] 1 1 2
attr(,"contrasts")
attr(,"contrasts")$`factor(rep(1:2, each = 9))`
[1] "contr.treatment"

attr(,"contrasts")$`factor(rep(rep(1:2, 1:2), 6))`
[1] "contr.treatment"

Then you can do the contrast between H and L.

Best,

Jim




> Thanks
> Nathan
> 
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor

-- 
James W. MacDonald, MS
Biostatistician
UMCCC cDNA and Affymetrix Core
University of Michigan
1500 E Medical Center Drive
7410 CCGC
Ann Arbor MI 48109
734-647-5623



More information about the Bioconductor mailing list