[BioC] relations between differentially expressed genes (using DESeq) and correlation coefficient

Simon Anders anders at embl.de
Thu Jul 21 21:55:41 CEST 2011


On 2011-07-20 17:24, Elizabeth Chun wrote:
> I am using DESeq to detect genes that are differentially
> expressed (DE). I am analyzing RNA-seq data from 6 samples, each of
> which belongs to a different class with no biological replicates – this is
> a poor experimental design as you noted in the DESeq vignette, but this
> is all I have.
> What I find odd is that the number of DE genes I get from
> pairwise DE detection across the 6 samples does not negatively
> correlate with the Pearson or Spearman correlation coefficient that I
> calculated pairwise among these 6 samples (i.e., I expected that
> pairs of libraries with a higher correlation coefficient would have fewer
> DE genes, but this is not what I am seeing).
> I am looking for your insight and was wondering if what DESeq is doing
> to detect DE genes may explain what I am observing.

As explained in the vignette, DESeq's blind mode assumes that most genes 
are not differentially expressed. It will hence call only those few 
genes as DE that show differences much larger than what is seen for most 
genes.

The value of a correlation coefficient, however, will be chiefly 
determined by what the many "typical" genes do, i.e., those that are not 
deemed DE.

So, while the number of DE genes reported by DESeq in blind mode is the 
number of genes deemed atypical, the correlation coefficient reflects 
the magnitude of differences between typical genes.

Clearly, these two numbers can vary quite independently, and don't need 
to show correlation.
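
To make this concrete, here is a small simulation sketch in Python. It 
is not DESeq and does not reproduce its statistics: the "atypical" count 
uses a hypothetical robust cutoff (differences more than 4 MAD-scaled 
spreads away from the median difference), and the gene count, noise 
levels and shift sizes are all made-up values chosen for illustration.

```python
import math
import random
import statistics


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate_pair(rng, n_genes, noise_sd, n_shifted, shift):
    """Simulate log-scale expression for two samples (illustrative only).

    Every gene shares a base level in both samples and gets independent
    noise of size noise_sd in each sample. The first n_shifted genes are
    additionally shifted by `shift` in the second sample.
    """
    x, y = [], []
    for g in range(n_genes):
        base = rng.gauss(5.0, 2.0)
        delta = shift if g < n_shifted else 0.0
        x.append(base + rng.gauss(0.0, noise_sd))
        y.append(base + delta + rng.gauss(0.0, noise_sd))
    return x, y


def count_atypical(x, y, k=4.0):
    """Count genes whose difference is far outside the typical spread.

    Stand-in for a DE call (an assumption, not DESeq's test): the typical
    spread is estimated robustly from the median absolute deviation (MAD).
    """
    d = [b - a for a, b in zip(x, y)]
    med = statistics.median(d)
    mad = statistics.median([abs(v - med) for v in d])
    spread = 1.4826 * mad  # MAD scaled to match a normal standard deviation
    return sum(abs(v - med) > k * spread for v in d)


rng = random.Random(1)

# Pair A: typical genes agree closely, but many genes are clearly shifted.
pair_a = simulate_pair(rng, n_genes=10000, noise_sd=0.2, n_shifted=200, shift=3.0)
# Pair B: typical genes are noisier, and only a few genes stand out.
pair_b = simulate_pair(rng, n_genes=10000, noise_sd=0.6, n_shifted=10, shift=6.0)

corr_a, n_a = pearson(*pair_a), count_atypical(*pair_a)
corr_b, n_b = pearson(*pair_b), count_atypical(*pair_b)

print("pair A: correlation", round(corr_a, 3), "atypical genes", n_a)
print("pair B: correlation", round(corr_b, 3), "atypical genes", n_b)
```

With these settings, pair A should show both the higher correlation 
coefficient and the larger number of atypical genes, because the 
correlation is driven by the noise among the typical genes while the 
count is driven by how many genes stand out from that noise.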

