[BioC] Batch effect in limma

James W. MacDonald jmacdon at med.umich.edu
Thu Sep 3 15:27:01 CEST 2009


Hi Yuan,

Yuan Hao wrote:
> Dear list,
> 
> I have 12 Affymetrix arrays for 4 RNA samples, with 3 replicates per
> sample (one from each of 3 batches). I want to look at differential
> expression between these samples. After clustering, we found an obvious
> batch effect in the data, so we decided to add batch to the linear
> model. I set up the design matrix as follows, but we can't find any of
> the differentially expressed genes that we expected. Could someone
> check whether there is any problem with my design and contrast
> matrices? Thank you very much in advance.

I don't see any problems here. When you say you can't find any of the 
differentially expressed genes that were expected, do you mean that 
you _did_ find differentially expressed genes, just not the ones you 
expected? Or do you mean that no genes are significantly differentially 
expressed?
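
One way to confirm the design for yourself is to check that it is balanced and of full rank. Below is a minimal sketch in base R, assuming the sample and batch assignments from the table in your post; the names balanced, full_rank and resid_df are only illustrative.

```r
# Sample genotype and batch for arrays 1..12, copied from the table in the post.
S <- factor(c("-/-","+/+","-/+","+/+","-/+","+/-","-/-","+/-","-/-","+/-","+/+","-/+"),
            levels = c("+/+","-/+","+/-","-/-"))
B <- factor(c(3,3,1,1,3,3,1,1,2,2,2,2), levels = c(1,2,3))

# Every sample should appear exactly once in every batch.
print(table(S, B))
balanced <- all(table(S, B) == 1)

# No intercept: one column per sample, plus batch 2 and batch 3 offsets
# relative to batch 1. With levels ordered 1, 2, 3, model.matrix() puts
# the batch 2 column before the batch 3 column, so label them in that order.
design <- model.matrix(~ 0 + S + B)
colnames(design) <- c("sample1","sample2","sample3","sample4","Batch2","Batch3")

# All coefficients are estimable if the rank equals the number of columns.
full_rank <- qr(design)$rank == ncol(design)

# Residual degrees of freedom left for estimating each gene's variance.
resid_df <- nrow(design) - ncol(design)
```

With 12 arrays and 6 coefficients there are only 6 residual degrees of freedom per gene, so a short list of significant genes may reflect limited power rather than a mistake in the design.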

Also note that you will need to specify a contrast with topTable() 
(through its coef argument) if you want the genes for a particular 
comparison.
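
A rough sketch of what that could look like follows. It assumes the limma package is installed and uses a simulated matrix y in place of eset.gcrma; the simulated values, the spiked-in genes, and the names top_a and counts are made up for illustration and say nothing about your data.

```r
library(limma)

set.seed(1)

# Same sample and batch layout as in the post.
S <- factor(c("-/-","+/+","-/+","+/+","-/+","+/-","-/-","+/-","-/-","+/-","+/+","-/+"),
            levels = c("+/+","-/+","+/-","-/-"))
B <- factor(c(3,3,1,1,3,3,1,1,2,2,2,2), levels = c(1,2,3))
design <- model.matrix(~ 0 + S + B)
colnames(design) <- c("sample1","sample2","sample3","sample4","Batch2","Batch3")

# Simulated stand-in for the real expression data: 200 genes x 12 arrays.
y <- matrix(rnorm(200 * 12), nrow = 200,
            dimnames = list(paste0("gene", 1:200), NULL))
# Spike a sample1 (+/+) shift into the first 5 genes so contrast 'a' has signal.
y[1:5, S == "+/+"] <- y[1:5, S == "+/+"] + 4

fit  <- lmFit(y, design)
cm   <- makeContrasts(a = sample1 - sample2,
                      b = sample1 - sample3,
                      c = sample2 - sample4,
                      levels = design)
fit2 <- eBayes(contrasts.fit(fit, cm))

# coef picks one contrast by name (a column index works too).
top_a <- topTable(fit2, coef = "a", number = 10, adjust.method = "BH")
print(top_a)

# Counts of genes called down / not significant / up for each contrast.
counts <- summary(decideTests(fit2))
print(counts)
```

The decideTests() summary is a quick way to tell the two situations apart: all zeros in the Up and Down rows means nothing is significant, whereas non-zero counts mean genes were found and the question becomes why they are not the expected ones.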


Best,

Jim


> 
> Array    Sample    Batch
> 1        -/-       3
> 2        +/+       3
> 3        -/+       1
> 4        +/+       1
> 5        -/+       3
> 6        +/-       3
> 7        -/-       1
> 8        +/-       1
> 9        -/-       2
> 10       +/-       2
> 11       +/+       2
> 12       -/+       2
> 
>  > S <- c("-/-","+/+","-/+","+/+","-/+","+/-","-/-","+/-","-/-","+/-","+/+","-/+")
>  > S <- factor(S, levels = c("+/+","-/+","+/-","-/-"))
>  > B <- c(3,3,1,1,3,3,1,1,2,2,2,2)
>  > B <- factor(B, levels = c(1,2,3))
>  > design <- model.matrix(~0+S+B)
>  > colnames(design)<-c("sample1","sample2","sample3","sample4","Batch3","Batch2")
> 
>  > design
>    sample1 sample2 sample3 sample4 Batch3 Batch2
> 1        0       0       0       1      1      0
> 2        1       0       0       0      1      0
> 3        0       1       0       0      0      0
> 4        1       0       0       0      0      0
> 5        0       1       0       0      1      0
> 6        0       0       1       0      1      0
> 7        0       0       0       1      0      0
> 8        0       0       1       0      0      0
> 9        0       0       0       1      0      1
> 10       0       0       1       0      0      1
> 11       1       0       0       0      0      1
> 12       0       1       0       0      0      1
> attr(,"assign")
> [1] 1 1 1 1 2 2
> attr(,"contrasts")
> attr(,"contrasts")$TS
> [1] "contr.treatment"
> 
> attr(,"contrasts")$BE
> [1] "contr.treatment"
> 
>  > fit<-lmFit(eset.gcrma,design)
>  > contrast.matrix<-makeContrasts(a=sample1-sample2,b=sample1-sample3,c=sample2-sample4,levels=design)
>  > contrast.matrix
>          Contrasts
> Levels     a  b  c
>   sample1  1  1  0
>   sample2 -1  0  1
>   sample3  0 -1  0
>   sample4  0  0 -1
>   Batch3   0  0  0
>   Batch2   0  0  0
>  > fit2<-contrasts.fit(fit,contrast.matrix)
>  > fit2<-eBayes(fit2)
> 
> 
> Cheers,
> Yuan

-- 
James W. MacDonald, M.S.
Biostatistician
Douglas Lab
University of Michigan
Department of Human Genetics
5912 Buhl
1241 E. Catherine St.
Ann Arbor MI 48109-5618
734-615-7826
