[BioC] Recommended gene model for DESeq

Assaf Gordon gordon at cshl.edu
Fri Apr 6 23:24:56 CEST 2012


Thank you all for your responses.

I'm still looking for the optimal way to count hits for paired-end RNA-Seq data. May I ask for a couple of clarifications?

Simon Anders wrote, On 04/05/2012 05:11 AM:
> You should make sure that each read is counted only once per gene.

Once per gene - got it.
What about a case where a read overlaps multiple genes? (described as "ambiguous" in the HTSeq-Count/GenomicRanges "modes")
Is it OK to count such a read several times (once for each of the different genes), or would that invalidate the results?

It seems "easyRNASeq" will count a read multiple times (once per gene) when using the "geneModels" summarization mode (based on [1], page 10). Can it still be used?
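
To make the two alternatives concrete, here is a minimal sketch in plain Python. It is not HTSeq, GenomicRanges or easyRNASeq code; the gene coordinates, the reads and the names GENES, overlapping_genes and count_reads are all made up for illustration. It contrasts setting a multi-gene read aside as "ambiguous" with counting it once for every gene it overlaps.

```python
from collections import Counter

# Hypothetical gene models on one chromosome: name -> (start, end), half-open.
# geneA and geneB overlap in the region 180..200.
GENES = {"geneA": (100, 200), "geneB": (180, 300), "geneC": (400, 500)}


def overlapping_genes(read_start, read_end, genes=GENES):
    """Return the sorted names of all genes whose interval overlaps the read."""
    return sorted(
        name
        for name, (g_start, g_end) in genes.items()
        if read_start < g_end and g_start < read_end
    )


def count_reads(reads, discard_ambiguous=True):
    """Count reads per gene.

    discard_ambiguous=True : a read overlapping several genes is tallied
                             under "ambiguous" and assigned to no gene.
    discard_ambiguous=False: such a read is counted once for EACH gene.
    """
    counts = Counter()
    for start, end in reads:
        hits = overlapping_genes(start, end)
        if not hits:
            counts["no_feature"] += 1
        elif len(hits) > 1 and discard_ambiguous:
            counts["ambiguous"] += 1
        else:
            for gene in hits:
                counts[gene] += 1
    return counts


# Made-up reads; the second one falls in the geneA/geneB overlap.
READS = [(110, 150), (185, 195), (250, 290), (420, 460), (600, 640)]

strict = count_reads(READS, discard_ambiguous=True)
multi = count_reads(READS, discard_ambiguous=False)
print("ambiguous discarded:", dict(strict))
print("counted per gene:   ", dict(multi))
```

In the second variant the per-gene counts add up to more than the number of reads that hit a gene, because the read in the overlap contributes to both geneA and geneB. That double contribution is what the question above is about.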


> If you want to stay in R: Valerie Obenchain has recently added functionality to GenomicRanges to perform counting in a way similar to my htseq-count script

Related question: handling paired-end data correctly.
It seems GenomicRanges does not handle paired-end reads at all (based on [2], page 2, section "3. counting mode"), so the only option is "htseq-count" - is that correct?
I also couldn't find any mention of paired-end data in the "easyRNASeq" PDF [1], so I don't know whether it handles that or not.
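
For clarity, this is what "handling paired-end data correctly" is assumed to mean here: the two mates of a pair come from one fragment, so the pair should contribute at most one count, based on the genes overlapped by either mate. The sketch below is plain Python with made-up coordinates and hypothetical names (PAIR_GENES, genes_for_fragment, count_fragments). It is not the actual logic of htseq-count or any Bioconductor package.

```python
from collections import Counter

# Hypothetical gene models on one chromosome: name -> (start, end), half-open.
PAIR_GENES = {"geneA": (100, 300), "geneB": (400, 600)}


def genes_for_fragment(mate1, mate2, genes):
    """Return the set of genes overlapped by either mate of a read pair."""
    hits = set()
    for start, end in (mate1, mate2):
        for name, (g_start, g_end) in genes.items():
            if start < g_end and g_start < end:
                hits.add(name)
    return hits


def count_fragments(pairs, genes):
    """Count each read pair once, as a single fragment."""
    counts = Counter()
    for mate1, mate2 in pairs:
        hits = genes_for_fragment(mate1, mate2, genes)
        if not hits:
            counts["no_feature"] += 1
        elif len(hits) == 1:
            counts[next(iter(hits))] += 1  # one count per pair, not per mate
        else:
            counts["ambiguous"] += 1
    return counts


# Made-up pairs, each given as (mate1, mate2).
PAIRS = [
    ((110, 160), (220, 270)),  # both mates in geneA
    ((250, 290), (410, 450)),  # mates in different genes
    ((420, 470), (650, 700)),  # one mate in geneB, the other in no gene
    ((700, 750), (800, 850)),  # neither mate in a gene
]

fragment_counts = count_fragments(PAIRS, PAIR_GENES)
print(dict(fragment_counts))
```

A tool that treats the two mates as independent single-end reads would instead count the first pair twice for geneA, which conflicts with the "once per gene" advice quoted above.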


Thanks,
 -gordon



[1] http://bioconductor.org/packages/release/bioc/vignettes/easyRNASeq/inst/doc/easyRNASeq.pdf
[2] http://bioconductor.org/packages/release/bioc/vignettes/GenomicRanges/inst/doc/summarizeOverlaps-modes.pdf
