[BioC] EdgeR: getting CPM values after batch effect correction

Wed Aug 28 10:02:13 CEST 2013

Hi Gilgi,

First, if you're just using edgeR for the CPM calculation, there's no 
need to estimate dispersions; they aren't used in calculating the CPM 
values. You may wish to pass normalized.lib.sizes=TRUE in the call to cpm 
to get CPM values that take into account the normalization factors 
computed by calcNormFactors (otherwise there's no point in calling 
calcNormFactors either).
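
As a rough sketch of what that looks like (the toy matrix "counts" below 
is made up for illustration and is not your data), the CPM values depend 
only on the counts, the library sizes and the normalization factors:

```r
library(edgeR)

# Hypothetical toy data: 100 genes x 4 samples of random counts
set.seed(1)
counts <- matrix(rnbinom(400, mu = 50, size = 5), nrow = 100)

y <- DGEList(counts = counts)
y <- calcNormFactors(y)  # TMM normalization factors

# CPM on the effective library sizes (lib.size * norm.factors);
# no dispersion estimates are involved in this step
cpm_tmm <- cpm(y, normalized.lib.sizes = TRUE)

# The same quantity computed by hand, to show what the call does
eff_lib <- y$samples$lib.size * y$samples$norm.factors
cpm_manual <- t(t(counts) / eff_lib) * 1e6
```

The two matrices should agree up to floating-point error, which is why 
the estimateGLM*Disp calls are unnecessary if CPM values are all you want.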

Second, according to the help text for removeBatchEffect, you're not 
supposed to include the batch effect in the design matrix. The design 
argument should contain only the experimental variables whose effects 
you want to keep (here, the treatment), while the batch argument should 
describe the technical batching that you want to remove.
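
Something along these lines, as a sketch only: the sample layout, the 
factor "treat" and the name "design_treat" are assumptions I made up for 
the example, so substitute your own batch and treatment factors.

```r
library(edgeR)
library(limma)  # provides removeBatchEffect

# Hypothetical layout: 2 batches x 2 treatments, 2 samples per combination
set.seed(1)
batch <- factor(rep(c(1, 2), each = 4))
treat <- factor(rep(c("A", "B"), times = 4))
counts <- matrix(rnbinom(800, mu = 50, size = 5), nrow = 100)

y <- calcNormFactors(DGEList(counts = counts))
logCPM <- cpm(y, log = TRUE, prior.count = 3)

# The design holds only the experimental variable; batch is passed separately
design_treat <- model.matrix(~ treat)
logCPM_batchRemoved <- removeBatchEffect(logCPM, batch = batch,
                                         design = design_treat)
```

The point of keeping the two apart is that the treatment effect described 
by the design is protected while the batch term is subtracted out.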

Finally, instead of doing a simple logCPM transform, you might also try 
the variance-stabilizing transformation provided by the DESeq package, 
which is intended for clustering and machine learning types of analyses.

-Ryan

On 08/27/2013 10:52 PM, Gilgi Friedlander wrote:
> Hi Ryan,
>
> Thanks a lot for the reply.
>
> I followed the edgeR user's guide and defined a model:
> y1<-DGEList(counts=countdata1,group=batch)
> y1<- calcNormFactors( y1 )
> design1 <- model.matrix(~batch+Treat1)
>
> batch has values 1 or 2, according to the batch of the experiment that was done.
>
> Treat has 10 different samples.
>
> In order to define a contrast I did the following:
> y1 <- estimateGLMCommonDisp(y1, design1, verbose=TRUE)#Now we are ready to construct an edgeR specific
> y1 <- estimateGLMTrendedDisp(y1, design1)
> y1 <- estimateGLMTagwiseDisp(y1, design1)
> fit <- glmFit(y1, design1)
> lrt <- glmLRT(fit,contrast=c(0,0,1,-1,0,0,0,0,0,0,0,0,0,0))
>
> I now also want to get the log counts after removal of the batch effect (for the purpose of clustering the genes).
>
> Is it correct to obtain the batch-removed log counts in the following way?
>
> logCPM <- cpm(y1, log=TRUE, prior.count=3)
> logCPM_batchRemoved<-removeBatchEffect(logCPM,batch=batch,design=design1)
>
> Many thanks,
> Gilgi
>
> -----Original Message-----
> From: Ryan [mailto:rct at thompsonclan.org]
> Sent: Wednesday, August 28, 2013 2:46 AM
> To: Gilgi Friedlander
> Cc: bioconductor at r-project.org
> Subject: Re: [BioC] EdgeR: getting CPM values after batch effect correction
>
> You would have to define batch correction. Are you talking about fitting a model of the form "~ experimentalVar + batchEffect" and then subtracting out the batch effect coefficient?
>
> On Sun Aug 25 11:11:55 2013, Gilgi Friedlander wrote:
>> Dear list,
>>
>> In edgeR, is it possible to get CPM values after batch effect correction (and after TMM normalization)?
>>
>> Thanks a lot!