[BioC] RNA-seq differentially expressed gene finding methods

Fri Sep 5 19:05:10 CEST 2014

N.B., I forgot to CC the list originally.

Hi Son,

To add a bit to Richard's response, there's also the issue that converting to FPKM/RPKM/TPM discards information about the precision of each measurement. For example, suppose two samples in a group produce values of 1.0 and 1.2 for some gene (in any of the aforementioned metrics). The number of mapped reads (or even the number aligning to genes) is rarely constant across samples, so one of those values was quite likely derived from more data than the other, and we'd like to weight the group-level estimate toward it. That's impossible with only FPKM/etc. values, since the underlying read counts and library sizes are no longer available.
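
As a rough illustration of the point above, here is a minimal Python sketch. Every number in it is hypothetical: a 1 kb gene, and library sizes of 10 and 50 million mapped reads are assumed purely for the example, as is a simple Poisson model for the counts. It back-calculates the read counts implied by FPKM values of 1.0 and 1.2 and compares a plain average with a depth-weighted (pooled) estimate. It is not how edgeR or DESeq work internally, just a way to see what the normalized values hide.

```python
import math

# Hypothetical inputs, chosen only for illustration.
GENE_LENGTH_KB = 1.0
samples = {
    "A": {"fpkm": 1.0, "mapped_millions": 10.0},
    "B": {"fpkm": 1.2, "mapped_millions": 50.0},
}


def implied_count(fpkm, mapped_millions, length_kb):
    """Invert FPKM = count / (length_kb * mapped_millions)."""
    return fpkm * mapped_millions * length_kb


# Read counts behind each FPKM value (unrecoverable from FPKM alone).
counts = {
    name: implied_count(s["fpkm"], s["mapped_millions"], GENE_LENGTH_KB)
    for name, s in samples.items()
}

# Under an assumed Poisson model, relative standard error is 1/sqrt(count),
# so the sample with more reads gives the more precise estimate.
rel_se = {name: 1.0 / math.sqrt(c) for name, c in counts.items()}

# Treating both FPKM values as equally reliable.
unweighted_mean = sum(s["fpkm"] for s in samples.values()) / len(samples)

# Pooling reads across samples weights each sample by its sequencing depth.
total_count = sum(counts.values())
total_depth = sum(s["mapped_millions"] for s in samples.values())
pooled_estimate = total_count / (total_depth * GENE_LENGTH_KB)

for name in samples:
    print(f"sample {name}: implied count = {counts[name]:.0f}, "
          f"relative SE = {rel_se[name]:.3f}")
print(f"unweighted mean FPKM = {unweighted_mean:.3f}")
print(f"depth-weighted FPKM  = {pooled_estimate:.3f}")
```

With these assumed inputs, sample B rests on six times as many reads as sample A, so the pooled estimate is pulled toward 1.2 rather than sitting midway between the two values.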

Best,
Devon
____________________________________________
Devon Ryan, Ph.D.
Email: dpryan at dpryan.com
Tel: +49 (0)178 298-6067
Molecular and Cellular Cognition Lab
German Centre for Neurodegenerative Diseases (DZNE)
Ludwig-Erhard-Allee 2
53175 Bonn, Germany

On Sep 5, 2014, at 6:44 PM, Son Pham wrote:

> Dear all,
> I know that we have very good packages (edgeR, DESeq) that calculate
> the list of differentially expressed genes in 2 conditions (with
> replicates) from raw counts. But I do not know what is wrong with the
> following simple approach (and whether other people have been using it):
> 
> 1. Get the (estimated) TPM/FPKM for each gene in each sample
> 2. Do a t-test for two groups on each gene.
> 3. Adjust the p value for multiple tests (p-adj)
> 
> 
> Thanks,
> 
> Son.