[BioC] ComBat: Working with no replicates
pfurio at cipf.es
Tue Oct 29 15:24:09 CET 2013
Pekka Kohonen <pkpekka at ...> writes:
> Dear Pedro,
> If you have just one sample from the lab, how do you differentiate
> between the cell line-specific effect and the lab-specific effect? I
> don't see how what you are trying to do with these 3 samples makes any
> sense. If you have the same cell lines measured in a different lab
> (which has enough samples to run ComBat) why not just use those then?
> Also, I wonder what is the minimum number of samples to estimate a
> lab-specific distribution (which is what ComBat is doing) for each
> gene? Probably 5-10 samples or so?
> I think that statistics should not be treated as just a way to hack
> your data so that it appears to be OK. This sounds a bit like doing
> Best, Pekka
> P.S. my name in Finnish means "Pedro"
> 2013/10/28 Pedro Furió Tarí <pfurio at ...>:
> > Dear all,
> > We have a mix of cell-lines run by 12 different labs (more than 150 samples
> > in total) and we have found a strong batch effect by laboratory that we
> > would like to correct. From those 12, there are 3 labs that are bringing
> > just one cell-line with no replicates at all (1 sample).
> > If we remove the samples from those 3 labs, we are able to run ComBat, but
> > we would like to keep them if possible. Is there any way? If we simulate a
> > "false replicate" just by copying the same expression values it works.
> > Could it be the way to go? Could these results be trustworthy?
> > We also would like to use the different cell-line names as the covariates,
> > but some of them don't have any replicates, so it doesn't work. Is there
> > any way we could also use them as categorical covariates? Right now we are
> > not giving any covariates information.
> > Any help would be much appreciated :)
> > Thanks in advance,
> > Pedro
> > _______________________________________________
> > Bioconductor mailing list
> > Bioconductor at ...
> > https://stat.ethz.ch/mailman/listinfo/bioconductor
> > Search the archives:
Maybe we did not explain the problem well. We do not want to perform any
statistical test on the data after correcting the batch effect, so we do not
need replicates for all the cell-lines. We need the batch effect corrected
for a different kind of analysis. We have a strong "lab effect" that we
would like to remove, but unfortunately some of the labs produced only 1
sample, which makes ComBat return an error. Perhaps ComBat cannot be
applied in this situation, but we wanted to be sure before trying another
strategy.
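
To make the single-sample problem concrete, here is a minimal Python sketch (not ComBat itself, and not R). It assumes the error comes from the per-lab variance being undefined when a lab contributes only one sample. The lab labels, the expression value and the helper names singleton_batches and batch_variance are all made up for illustration. It also shows what copying a sample as a "false replicate" does: the variance becomes computable, but it is exactly zero, so it carries no information about the real within-lab spread.

```python
from collections import Counter
from statistics import StatisticsError, variance


def singleton_batches(batch_labels):
    """Return the batch labels that occur exactly once, sorted."""
    counts = Counter(batch_labels)
    return sorted(label for label, n in counts.items() if n == 1)


def batch_variance(values):
    """Sample variance (n - 1 denominator); None when it is undefined."""
    try:
        return variance(values)
    except StatisticsError:
        # statistics.variance needs at least two data points
        return None


# Hypothetical lab label for each sample (illustration only)
labs = ["lab_A", "lab_A", "lab_A", "lab_B", "lab_B", "lab_C"]
singles = singleton_batches(labs)

# One gene measured in a lab that produced a single sample
var_single = batch_variance([7.2])

# The same value duplicated as a "false replicate"
var_copied = batch_variance([7.2, 7.2])

print(singles)     # labs with exactly one sample
print(var_single)  # None: variance undefined for one sample
print(var_copied)  # 0.0: computable, but carries no information
```

A check like singleton_batches can be run on the real lab annotation first, to list exactly which labs would have to be dropped or handled separately.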
Thanks so much for your kind response.