[BioC] Batch effect confounded with additional covariate

Jie Yang [guest] guest at bioconductor.org
Sun Jul 27 01:26:05 CEST 2014


Hi, everyone. 

I have a question about the ComBat function in the sva package. I am using ComBat to remove batch effects from methylation data.

For the test data, the sample information is:

sample	covariate	batch
101	1	A
102	1	A
103	1	B
104	1	B
201	2	A
202	2	A
203	2	C
204	2	C

There are three batches and one covariate with two levels.
ComBat runs on this data without problems.

However, if I change the data like this:

sample	covariate	batch
101	1	A
102	1	A
103	1	B
104	1	B
201	2	D
202	2	D
203	2	C
204	2	C

Then I get an error message:
ComBat failed… the batch effect is confounded with the covariate.

I searched the Google group for this question. The answer given there is that the difference between batch B and batch D may come from the difference between covariate level 1 and covariate level 2, so the two effects cannot be separated. That is why the batch effect is confounded with the covariate.
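
To make the confounding concrete, here is a minimal sketch in base R. It assumes that the failure corresponds to a rank-deficient design matrix built from the batch indicators plus the covariate; that is an assumption about what the check does internally, not something confirmed in this post. The names is_confounded, batch_ok and batch_bad are hypothetical and only for illustration.

```r
# Hypothetical helper: TRUE when the covariate column is a linear
# combination of the batch indicator columns (rank-deficient design).
# Assumption: this mirrors the kind of check that triggers the error.
is_confounded <- function(covariate, batch) {
  # One indicator column per batch, no intercept
  batch_part <- model.matrix(~ 0 + factor(batch))
  # Covariate indicators with the intercept column dropped
  cov_part <- model.matrix(~ factor(covariate))[, -1, drop = FALSE]
  design <- cbind(batch_part, cov_part)
  qr(design)$rank < ncol(design)
}

covariate <- c(1, 1, 1, 1, 2, 2, 2, 2)
# First test layout: batch A contains samples from both covariate values
batch_ok  <- c("A", "A", "B", "B", "A", "A", "C", "C")
# Second test layout: every batch sits inside a single covariate value
batch_bad <- c("A", "A", "B", "B", "D", "D", "C", "C")

is_confounded(covariate, batch_ok)   # FALSE
is_confounded(covariate, batch_bad)  # TRUE
```

In the second layout the covariate indicator equals the sum of the C and D batch indicators, so no model can tell a covariate effect apart from a combination of batch effects.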

I think this may be a limitation of the algorithm, but all of my real data are like that: each batch belongs to only one level of the covariate, like this:

sample	covariate	batch
101	1	A
102	1	A
103	1	B
104	1	B
105	1	C
106	1	C
201	2	D
202	2	D
203	2	E
204	2	E
205	2	F
206	2	F
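
The nesting in this layout can be seen with a cross-tabulation. This is only a sketch in base R; the variable names tab and nested are illustrative and not part of any package.

```r
# Sample layout from the table above: batches A-C have covariate 1,
# batches D-F have covariate 2, two samples per batch.
covariate <- rep(c(1, 2), each = 6)
batch <- rep(c("A", "B", "C", "D", "E", "F"), each = 2)

# Rows are batches, columns are covariate values
tab <- table(batch, covariate)
print(tab)

# Batch is nested within the covariate when every batch has samples
# in exactly one covariate column.
nested <- all(rowSums(tab > 0) == 1)
nested  # TRUE
```

When nested is TRUE, no batch contains samples from both covariate values, so there is no within-batch contrast from which the covariate effect could be estimated separately from the batch effects.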

So, does anybody have any ideas (especially Dr. Evan Johnson)?
Thank you very much.

Jie Yang
Graduate student
UTHealth at Houston
School of Public Health





 -- output of sessionInfo(): 

R version 3.0.3 (2014-03-06)
Platform: x86_64-apple-darwin10.8.0 (64-bit)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] IlluminaHumanMethylation450kmanifest_0.4.0 ChAMP_1.2.7                               
 [3] Illumina450ProbeVariants.db_0.99.3         ChAMPdata_0.99.6                          
 [5] minfi_1.8.9                                bumphunter_1.2.0                          
 [7] locfit_1.5-9.1                             iterators_1.0.7                           
 [9] foreach_1.4.2                              Biostrings_2.30.1                         
[11] GenomicRanges_1.14.4                       XVector_0.2.0                             
[13] IRanges_1.20.7                             reshape_0.8.5                             
[15] lattice_0.20-29                            Biobase_2.22.0                            
[17] BiocGenerics_0.8.0                        

loaded via a namespace (and not attached):
 [1] annotate_1.40.1       AnnotationDbi_1.24.0  base64_1.1            beanplot_1.1          cluster_1.15.2       
 [6] codetools_0.2-8       corpcor_1.6.6         DBI_0.2-7             digest_0.6.4          DNAcopy_1.36.0       
[11] doRNG_1.6             genefilter_1.44.0     grid_3.0.3            illuminaio_0.4.0      impute_1.36.0        
[16] itertools_0.1-3       limma_3.18.13         marray_1.40.0         MASS_7.3-33           matrixStats_0.10.0   
[21] mclust_4.3            multtest_2.18.0       nlme_3.1-117          nor1mix_1.1-4         pkgmaker_0.22        
[26] plyr_1.8.1            preprocessCore_1.24.0 R.methodsS3_1.6.1     RColorBrewer_1.0-5    Rcpp_0.11.2          
[31] registry_0.2          rngtools_1.2.4        RPMM_1.10             RSQLite_0.11.4        siggenes_1.36.0      
[36] splines_3.0.3         stats4_3.0.3          stringr_0.6.2         survival_2.37-7       sva_3.8.0            
[41] tools_3.0.3           wateRmelon_1.2.2      XML_3.95-0.2          xtable_1.7-3   

--
Sent via the guest posting facility at bioconductor.org.


