[BioC] Different number of genes when using HTseq and cuffdiff

Steve Lianoglou mailinglist.honeypot at gmail.com
Fri Dec 14 15:41:32 CET 2012


Hi,

On Fri, Dec 14, 2012 at 9:05 AM, Fatemehsadat Seyednasrollah
<fatsey at utu.fi> wrote:
> Hi,
>
> I used the same TopHat output BAM files for both HTSeq (followed by
> DESeq) and cuffdiff to find DE genes. I do not understand why, even
> though the BAM files and references are the same, the number of genes
> differs between the cuffdiff and HTSeq results. I expected the counts
> for each gene to differ, but I did not expect to get more genes
> (nearly 100 more) from HTSeq than from cuffdiff.

What do you mean by "more genes"? Do you mean that more genes are
called as differentially expressed? Or are you using some pipeline
just to count reads over genes, and the two pipelines report a
different number of genes as "input"?

If it's the former -- cuffdiff and DESeq do rather different things to
assess differential expression, and so your result should not be a
surprise.
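
To check the second possibility, you could compare the gene identifiers
that each tool reports and see which ones appear on only one side. The
following is a minimal sketch in Python, not a definitive recipe. It
assumes that the htseq-count output is a two-column, tab-separated table
(gene ID, count) with summary rows such as "no_feature" at the end, that
the cuffdiff gene-level table is tab-separated with a header row
containing a "gene_id" column, and that both tools use the same
identifiers. The function names are made up for illustration.

```python
import csv
import io

# Summary rows that htseq-count appends after the per-gene counts.
# Assumption: older releases write them bare, newer ones prefix them with "__".
HTSEQ_SUMMARY = {
    "no_feature",
    "ambiguous",
    "too_low_aQual",
    "not_aligned",
    "alignment_not_unique",
}


def htseq_gene_ids(handle):
    """Collect gene IDs from an htseq-count table (gene_id <TAB> count)."""
    ids = set()
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        gene_id = row[0]
        if gene_id.lstrip("_") in HTSEQ_SUMMARY:
            continue  # skip the summary rows, they are not genes
        ids.add(gene_id)
    return ids


def cuffdiff_gene_ids(handle, column="gene_id"):
    """Collect gene IDs from a tab-separated table that has a header row."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {row[column] for row in reader if row.get(column)}


def compare_gene_sets(htseq_ids, cuffdiff_ids):
    """Split the identifiers into shared and tool-specific sets."""
    return {
        "shared": htseq_ids & cuffdiff_ids,
        "htseq_only": htseq_ids - cuffdiff_ids,
        "cuffdiff_only": cuffdiff_ids - htseq_ids,
    }
```

With real data you would open the two result files and pass the file
handles to these functions. If the "htseq_only" set accounts for the
roughly 100 extra genes, looking at those IDs (for example, whether they
all have zero counts) should show where the two pipelines diverge. If
the identifiers do not match at all, the two tools are probably using
different ID schemes, and you would need to compare on another column.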

While I haven't actually read the paper, I would imagine the new
publication on cuffdiff2 would be rather informative in this regard:

http://www.nature.com/nbt/journal/vaop/ncurrent/full/nbt.2450.html

You haven't said what version of each software you are using, but I
guess you're using cuffdiff 2?

-steve

-- 
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact


