[BioC] combat error message

Johnson, William Evan wej at bu.edu
Wed May 7 23:18:46 CEST 2014


Hi Guan, 

I think in your case the correct thing to do is to recode Base1 and Base2 as a single level, Base. However, if you are only interested in comparing 'Post7 - Base1' and 'During4 - Base2', then you are also fine doing the adjustment separately in two runs, one for each set of samples. So it is up to you.
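
To see why the recoding helps, here is a minimal base-R sketch. It assumes the design that fails is essentially the batch indicators bound to the Status covariates without their intercept; that is an approximation for illustration, not the package's actual code. `design_rank` is a hypothetical helper, not an sva function, and the two vectors are typed in from the pheno table quoted below.

```r
# Phenotype columns transcribed from the pheno table in the quoted message.
batch <- factor(c(rep(1, 12), rep(2, 10), rep(3, 2)))
status <- c(
  rep(c("Base1", "Post7"), 6),                        # batch 1
  "Base2", "During4", "Base2", "During4", "Base2",    # batch 2
  "During4", "During4", "Base2", "During4", "Base2",
  "During4", "Base2"                                  # batch 3
)

# Cross-tabulation: Base1/Post7 occur only in batch 1 and Base2/During4 only
# in batches 2 and 3, so batch and Status are confounded.
print(table(Status = status, batch = batch))

# Hypothetical helper (an assumption, not part of sva): bind batch indicators
# to the Status covariates (intercept dropped) and compare the number of
# columns with the numerical rank of the resulting matrix.
design_rank <- function(batch, status) {
  batchmod <- model.matrix(~ 0 + batch)
  mod <- model.matrix(~ factor(status))[, -1, drop = FALSE]
  design <- cbind(batchmod, mod)
  c(columns = ncol(design), rank = qr(design)$rank)
}

before <- design_rank(batch, status)

# Recode Base1 and Base2 as a single level, Base.
status_merged <- sub("^Base[12]$", "Base", status)
after <- design_rank(batch, status_merged)

print(rbind(before, after))
```

If the rank is below the number of columns before merging and equal to it afterwards, the "exactly singular" error should no longer be triggered by this particular confounding once Status is recoded.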

Hope this helps.

Evan




On May 5, 2014, at 5:26 AM, Guan Wang <Guan.Wang at glasgow.ac.uk> wrote:

> Dear Amit and Evan,
> 
> Sorry to write to you out of the blue. I read your post http://permalink.gmane.org/gmane.science.biology.informatics.conductor/49978 regarding a ComBat error message, as I have run into the same problem.
> 
> Your post helped me understand the reason for the error. I have several other questions about the analysis strategy in light of it. I posted these to the Bioconductor mailing list a few days ago but have not received any replies. Could you perhaps take a few minutes to look at the post below? Many thanks for your time and any suggestions you may have.
> 
> My post to the Bioconductor list is attached below. Thanks.
> 
> Hi All,
> 
> I understand from the previous post "[BioC] ComBat_ Error in solve.default(t(design) %*% design): Lapack routine dgesv: system is exactly singular: U[4, 4] = 0" that this error arises when batch is confounded with covariate status. I got the same ComBat error when running surrogate variable analysis (SVA) and have several related questions. I hope you can have a look. Many thanks for any opinions/suggestions.
> 
> Data set: 24 samples from 6 subjects (4 time points/subject: 2 baseline samples collected on different days, 1 during drug treatment, 1 after drug treatment). Experiments were done with Affymetrix GeneChip 3.0 for miRNA expression profiling. 
> 
> Initial data analysis: "oligo" is used to read the Affy CEL files and "rma()" is used for data normalization. After this, PC1 on the PCA plot still seems to correlate with some batch effect whose source I am not aware of (i.e. it does not come from different scan dates). The "sva" package is then used to estimate the surrogate variables, followed by "ComBat()".
> 
> Now to the ComBat error, which occurred when I specified the contrasts as (Base2-Base1, During-Base1, Post-Base1). The pheno input is given below:
> 
> 	                        sample	batch	Status
> GW2miRNA1_(miRNA-3_0).CEL	1	1	Base1
> GW2miRNA2_(miRNA-3_0).CEL	1	1	Post7
> GW2miRNA3_(miRNA-3_0).CEL	2	1	Base1
> GW2miRNA4_(miRNA-3_0).CEL	2	1	Post7
> GW2miRNA5_(miRNA-3_0).CEL	3	1	Base1
> GW2miRNA6_(miRNA-3_0).CEL	3	1	Post7
> GW2miRNA7_(miRNA-3_0).CEL	4	1	Base1
> GW2miRNA8_(miRNA-3_0).CEL	4	1	Post7
> GW2miRNA9_(miRNA-3_0).CEL	5	1	Base1
> GW2miRNA10_(miRNA-3_0).CEL	5	1	Post7
> GW2miRNA11_(miRNA-3_0).CEL	6	1	Base1
> GW2miRNA12_(miRNA-3_0).CEL	6	1	Post7
> GW1miRNA13_(miRNA-3_0).CEL	6	2	Base2
> GW1miRNA14_(miRNA-3_0).CEL	6	2	During4
> GW1miRNA15_(miRNA-3_0).CEL	4	2	Base2
> GW1miRNA16_(miRNA-3_0).CEL	1	2	During4
> GW1miRNA17_(miRNA-3_0).CEL	5	2	Base2
> GW1miRNA18_(miRNA-3_0).CEL	5	2	During4
> GW1miRNA19_(miRNA-3_0).CEL	4	2	During4
> GW1miRNA20_(miRNA-3_0).CEL	3	2	Base2
> GW1miRNA21_(miRNA-3_0).CEL	3	2	During4
> GW1miRNA22_(miRNA-3_0).CEL	1	2	Base2
> GW1miRNA23_(miRNA-3_0).CEL	2	3	During4
> GW1miRNA24_(miRNA-3_0).CEL	2	3	Base2
> 
> I understand that batch is confounded with status, as you can see in the phenotype file above. The two baseline samples come from the same subjects but were collected on different days before the drug was injected. I am therefore wondering whether it makes sense to merge "Base1" and "Base2" into a single "Base" level, define the contrasts "During - Base" and "Post - Base", keep the other columns of the pheno file unchanged, and re-run "sva". Or is it more appropriate to do two separate "sva" analyses: "Post7 - Base1" for the first 12 samples, which were hybridized and scanned at the same time, and "During4 - Base2" for the last 12 samples, which were treated as one batch but scanned at two different times (which is why they are labelled batch 2 and 3 in the "batch" column)?
> 
> I hope I have described this clearly. Any suggestions/opinions would be much appreciated.
> 
> Regards
> Guan


