[BioC] Remove batch effect in small RNASeq study (SVA or others?)

shirley zhang shirley0818 at gmail.com
Mon Apr 28 02:32:55 CEST 2014


Dear Dr. Smyth,

Thank you very much for your quick reply. I did as you suggested: I first
computed log-CPM values with cpm() and then called removeBatchEffect(). The
PCA plot looks better than before, but a batch effect is still visible.

I have attached two PCA figures. The first is based on log10(raw count),
before calling cpm() and removeBatchEffect(). The second is based on the
log-CPM values after removeBatchEffect().
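
For readers following the thread, here is a minimal sketch of this kind of
before/after PCA comparison. It rests on several assumptions: the counts are
simulated (not the data from this study), all object names are hypothetical,
and the code uses only base R as a simplified stand-in for cpm() and
removeBatchEffect(). It follows the same general idea (fit a per-gene linear
model with a batch term and subtract the fitted batch component), but it is
not the exact edgeR/limma computation.

```r
# Simulated counts: 200 genes x 15 samples, two batches of 8 and 7 samples.
set.seed(1)
n_genes <- 200
batch <- factor(rep(c("batch1", "batch2"), times = c(8, 7)))
n_samples <- length(batch)

base_mean <- rexp(n_genes, rate = 1 / 200) + 5      # per-gene expected count
mu <- matrix(base_mean, nrow = n_genes, ncol = n_samples)
shift <- rnorm(n_genes, sd = 1)                     # gene-specific batch shift (log2 scale)
is_b2 <- batch == "batch2"
mu[, is_b2] <- mu[, is_b2] * 2^shift
counts <- matrix(rpois(length(mu), lambda = mu), nrow = n_genes)

# Simplified log2-CPM with a prior count of 5 (an approximation, not edgeR's formula).
lib_size <- colSums(counts)
log_cpm <- t(log2((t(counts) + 5) / (lib_size + 10) * 1e6))

# Per-gene linear model with intercept + batch (sum-to-zero coding),
# then subtract only the fitted batch component.
design <- model.matrix(~ batch, contrasts.arg = list(batch = "contr.sum"))
fit <- lm.fit(design, t(log_cpm))
batch_part <- design[, -1, drop = FALSE] %*% fit$coefficients[-1, , drop = FALSE]
adjusted <- log_cpm - t(batch_part)

# Squared correlation between the first principal component and batch.
pc1_batch_r2 <- function(x) {
  pc1 <- prcomp(t(x))$x[, 1]
  cor(pc1, as.numeric(batch))^2
}
r2_before <- pc1_batch_r2(log_cpm)
r2_after <- pc1_batch_r2(adjusted)
cat(sprintf("PC1 vs batch R^2 before: %.3f, after: %.3g\n", r2_before, r2_after))
```

An adjustment of this kind can only remove per-gene mean differences between
batches, so any batch structure that remains in a PCA afterwards would have to
be of another kind (for example, differences in variance). The rows of
`adjusted` are the batch-adjusted values that a downstream residual-based
analysis would start from.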

Could you look at them and give me further suggestions? Would quantile
normalization across samples be a good idea, since cpm() still normalizes
only within each sample?

Thanks again for your help,
Shirley


On Sun, Apr 27, 2014 at 6:54 PM, Gordon K Smyth <smyth at wehi.edu.au> wrote:

> Dear Shirley,
>
> I would probably do it like this:
>
>   library(edgeR)
>   logCPM <- cpm(y,log=TRUE,prior.count=5)
>   logCPM <- removeBatchEffect(logCPM, batch=batch)
>
> Best wishes
> Gordon
>
>> Date: Sat, 26 Apr 2014 10:51:23 -0400
>> From: shirley zhang <shirley0818 at gmail.com>
>> To: Bioconductor Mailing List <bioconductor at stat.math.ethz.ch>
>> Subject: [BioC] Remove batch effect in small RNASeq study (SVA or others?)
>>
>> I have paired-end RNA-Seq data from two batches (8 samples from batch1
>> and 7 samples from batch2). After alignment with TopHat2, I obtained
>> counts using htseq-count and FPKM values via Cufflinks. A large batch
>> effect is visible in PCA of both the log10(raw count) and log10(FPKM)
>> values.
>>
>> I cannot use a blocking factor in edgeR to remove the batch effect,
>> because I first need to obtain residuals after adjusting for the batch
>> effect and then test those residuals against hundreds of thousands of
>> SNPs (eQTL analysis).
>>
>> My question is how to remove the batch effect without using edgeR:
>>
>> 1. Is SVA OK for such a small sample size (N=15)?
>> 2. If SVA does not work, are there any other suggestions?
>>
>> Many thanks,
>> Shirley
>>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: all.count.log10.pca.pdf
Type: application/pdf
Size: 5034 bytes
Desc: not available
URL: <https://stat.ethz.ch/pipermail/bioconductor/attachments/20140427/6300fe9d/attachment.pdf>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: all.count.logCPM.rmBatch.pca.pdf
Type: application/pdf
Size: 5120 bytes
Desc: not available
URL: <https://stat.ethz.ch/pipermail/bioconductor/attachments/20140427/6300fe9d/attachment-0001.pdf>

