[BioC] Rsubread vs. BWA, Bowtie, etc. and RPKM vs. normalized counts

Sean Davis sdavis2 at mail.nih.gov
Mon Oct 10 02:33:21 CEST 2011


Hi, Tim.

For aligning RNA-seq reads, the major questions include:

1)  Paired-end or not?
2)  Interested in junction reads?
3)  Allow indels or not?
4)  Interested in known transcripts/junctions or also interested in
novel transcripts?

If you answer those questions, you can usually limit your choices
significantly.

As for stats, consider following the DESeq or edgeR workflows.  Both
work from raw counts and can test for differential exon usage, and
junction read counts could potentially be fed into similar tests.

Sean

On Sun, Oct 9, 2011 at 8:07 PM, Tim Triche, Jr. <tim.triche at gmail.com> wrote:
> A professor sent me a bunch of raw RNA-seq reads (as FASTQ files) and I want
> to align them, and I couldn't really make heads or tails of the options, so
> I listened to what Phil Green told me at a conference and looked around for
> a sensible word-nucleated aligner like he described.  It seems that Rsubread
> works this way?
>
> http://sourceforge.net/projects/subread/
>
> I would like BAM files as intermediate output, but my real interest is
> differential exon usage in differentiating cells.  Given that the reads I
> have to align are relatively short (36bp, SE), is there an advantage or
> disadvantage in using subread compared to other options?  And when I'm done
> trimming and aligning, I could choose raw counts, conditional quantile
> normalized counts, or something like RPKM to summarize how often a given
> exon seems to have been transcribed.  I read this:
>
> http://seqanswers.com/forums/showthread.php?t=586
>
> and I see that packages using a Gamma prior for the dispersion of a Poisson
> count model benefit from having raw counts.  If I am after correlated
> changes in exon usage depending on other sequence features, is it reasonable
> to use (say) 'cqn' on the raw counts, then log-transform and work with those
> normalized counts?
>
> Thanks for any suggestions,
>
> --
> Tim Triche, Jr.
> USC Biostatistics


