[BioC] zero rna-seq values AFTER normalisation in edgeR

Gordon K Smyth smyth at wehi.EDU.AU
Sun Aug 17 02:24:54 CEST 2014

Dear Nick N,

Thanks for using edgeR.  You do, however, have some misunderstandings 
about how normalization works and what the cpm() function outputs.

> Date: Fri, 15 Aug 2014 14:23:09 +0100
> From: Nick N <feralmedic at gmail.com>
> To: bioconductor at r-project.org
> Subject: [BioC] zero rna-seq values AFTER normalisation in edgeR
> I am using edgeR to analyze RNA-Seq data. This is my script:
> library("edgeR")


> d <- calcNormFactors(d)
> all_cpm=cpm(d, normalized.lib.size=TRUE)


> I believe that the variable "all_counts" shall contain the normalized
> counts for each sample in each condition.

The cpm() function simply computes counts-per-million, which is a 
ratio rather than a count.

> My understanding is also that edgeR adds pseudocounts BEFORE performing 
> the library normalisation.

No it doesn't.  Why would you think that?  edgeR works with your data as 
it actually is rather than trying to fudge it.

> Thus it is possible that some values revert to being zero after 
> normalisation. But I thought that this would happen rarely. Yet in a 
> recent dataset I find an improbably large number of zero values in 
> "all_counts" which made me think that my understanding of how 
> pseudocounts and normalisation work in edgeR might be incorrect. Can, 
> please, somebody comment on this?

cpm() simply computes counts per million by dividing the counts by the 
normalized library sizes.  Obviously a zero count corresponds to a zero 
count-per-million.  That seems pretty natural!
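
To make the arithmetic concrete, here is a minimal base-R sketch of that 
calculation.  It is not edgeR's own code: the toy counts and the 
normalization factors are made up for illustration, and the normalized 
library size is assumed to be the raw library size multiplied by the 
normalization factor.

```r
# Toy count matrix: 3 genes x 2 samples (made-up values for illustration)
counts <- matrix(c(0, 10, 90,
                   5,  0, 195),
                 nrow = 3,
                 dimnames = list(c("g1", "g2", "g3"), c("s1", "s2")))

lib.size     <- colSums(counts)           # raw library sizes: 100 and 200
norm.factors <- c(s1 = 1.25, s2 = 0.8)    # hypothetical factors, standing in for calcNormFactors() output
eff.lib.size <- lib.size * norm.factors   # normalized (effective) library sizes

# Counts per million: divide each sample's counts by its effective library size
cpm.manual <- t(t(counts) / eff.lib.size) * 1e6

# A zero count gives a zero cpm, whatever the normalization factor is
stopifnot(all(cpm.manual[counts == 0] == 0))
```

No step in this calculation can turn a zero count into a non-zero value, 
so the zeros in the cpm matrix are simply the zeros in the raw counts.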

Are you perhaps thinking of the use of prior.count when computing cpm or 
logFC on the log-scale?  The help page for the cpm() function tells you 
that prior counts are not used when computing plain cpm values on the raw 
scale.
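
The difference between the two scales can be sketched in base R as 
follows.  This is a simplified illustration under stated assumptions, not 
edgeR's implementation: it adds the same illustrative prior count of 2 to 
every sample, whereas edgeR's log-cpm calculation adjusts the prior count 
for library size, so the numbers will not match cpm(d, log=TRUE) exactly.  
The counts and normalization factors are made up.

```r
# Toy count matrix: 3 genes x 2 samples (made-up values for illustration)
counts <- matrix(c(0, 10, 90,
                   5,  0, 195),
                 nrow = 3,
                 dimnames = list(c("g1", "g2", "g3"), c("s1", "s2")))

# Effective library sizes, using hypothetical normalization factors
eff.lib.size <- colSums(counts) * c(1.25, 0.8)

prior.count <- 2   # illustrative value, added only on the log scale

# Raw scale: no prior count, so zero counts stay exactly zero
cpm.raw <- t(t(counts) / eff.lib.size) * 1e6

# Log scale (simplified): the prior count keeps log2 finite for zero counts
log.cpm <- log2(t(t(counts + prior.count) / (eff.lib.size + 2 * prior.count)) * 1e6)

stopifnot(all(cpm.raw[counts == 0] == 0))   # zeros preserved on the raw scale
stopifnot(all(is.finite(log.cpm)))          # no -Inf on the log scale
```

The prior count only enters the log-scale calculation; the raw cpm values 
are untouched by it.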

I wonder what source you are relying on for information about edgeR?  The 
most reliable source is the documentation that comes with edgeR.

Best wishes

