[BioC] perspective on differential expression counting - unmapped reads?

Ryan rct at thompsonclan.org
Thu Oct 31 18:49:25 CET 2013


I would not expect including these counts to have a noticeable effect 
on the p-values of the other genes, since edgeR does not normalize to 
the total counts but uses TMM instead (unless you forgot to call 
calcNormFactors).
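
If you would still rather drop those rows before building the count 
matrix, here is a minimal Python sketch of one way to do it. It is not 
part of htseq-count or edgeR. The function name split_htseq_counts and 
the gene IDs GENE_A and GENE_B are made up for illustration, and I am 
assuming the usual two-column, tab-separated htseq-count output. Newer 
htseq-count versions prefix the summary rows with "__", so both forms 
are handled.

```python
# Summary rows that htseq-count appends after the per-gene counts.
SPECIAL_ROWS = {
    "no_feature",
    "ambiguous",
    "too_low_aQual",
    "not_aligned",
    "alignment_not_unique",
}


def split_htseq_counts(lines):
    """Split htseq-count output lines into (gene_counts, special_counts)."""
    gene_counts = {}
    special_counts = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # The count is the last whitespace-separated field on the line.
        name, count = line.rsplit(None, 1)
        # Accept both "no_feature" and "__no_feature".
        key = name[2:] if name.startswith("__") else name
        if key in SPECIAL_ROWS:
            special_counts[key] = int(count)
        else:
            gene_counts[name] = int(count)
    return gene_counts, special_counts


# Hypothetical gene rows followed by the summary rows from the question.
example = [
    "GENE_A\t120",
    "GENE_B\t30",
    "no_feature\t152030",
    "ambiguous\t4876",
    "too_low_aQual\t0",
    "not_aligned\t0",
    "alignment_not_unique\t0",
]
genes, special = split_htseq_counts(example)
print("reads assigned to genes:", sum(genes.values()))
print("reads in summary rows:", sum(special.values()))
```

Keeping the summary counts in a separate dictionary lets you still 
report them per sample without passing them to the differential 
expression step.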

On Thu Oct 31 10:30:54 2013, Jon BR wrote:
> Hello,
>      I'm interested in calculating differential expression from some paired
> RNAseq samples.
>
> I've used htseq-count after mapping; quite happy with how easy that was.
>
> My question is with regard to whether or not to trim the last five rows
> from htseq-count output.
>
> Those rows look like this:
> no_feature 152030
> ambiguous 4876
> too_low_aQual 0
> not_aligned 0
> alignment_not_unique 0
>
> I can think of reasons supporting either side of this question. The
> unmapped or ambiguously mapping reads do contribute to the total library
> size.  However, I'm also interested in quantifying the difference between
> what's human in both samples, so intuition would tell me to remove those
> reads.
>
> Because the counts are big, this matters a great deal.  I'm using edgeR
> (again, very happy with that software), and the manual cites htseq-count as
> a viable methodology, but doesn't comment on the preferred treatment of
> the unmapped reads.
>
> My first (somewhat careless) use of edgeR gave us results that
> appeared to make sense, but upon digging a little deeper, I noticed that
> this question affects the p-values quite a lot because the unmapped counts
> are so big.
>
> I would appreciate any comments/opinions!
>
> Thanks,
> Jonathan