[BioC] summarizeOverlaps mode ignoring inter feature overlaps

Ryan C. Thompson rct at thompsonclan.org
Tue Apr 9 09:07:43 CEST 2013


Memory usage can be kept down with the yieldSize argument to the 
BamFile (or BamFileList) constructor, so that only a limited number of 
reads (or read pairs) is read at a time.
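
Something along these lines should work. It is only a sketch: the 
wrapper name count_chunked, the file names, the chunk size of one 
million records and the choice of mode = "Union" are all placeholders 
of mine, and "features" is assumed to be a GRanges or GRangesList you 
have already built. Depending on your Bioconductor release, 
summarizeOverlaps lives in GenomicRanges or in GenomicAlignments.

```r
library(Rsamtools)
library(GenomicAlignments)  # older releases: library(GenomicRanges)

# Hypothetical helper: count reads over 'features' while reading each
# BAM file in chunks of at most 'yield_size' records.
count_chunked <- function(bam_paths, features, yield_size = 1000000L) {
    # yieldSize caps how many records are pulled from each file per
    # read, which is what keeps the memory footprint bounded.
    bfl <- BamFileList(bam_paths, yieldSize = yield_size)
    se <- summarizeOverlaps(features, bfl, mode = "Union")
    assay(se)  # features x samples matrix of counts
}

# Usage with placeholder file names:
# counts <- count_chunked(c("sample1.bam", "sample2.bam"), features)
```

A smaller yield_size lowers peak memory at the cost of more passes 
through the read loop, so it is worth tuning to the RAM you have.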

On Mon 08 Apr 2013 05:52:21 PM PDT, Thomas Girke wrote:
> Dear Valerie,
>
> Is there currently any way to run summarizeOverlaps in a feature-overlap
> unaware mode, e.g. with an ignorefeatureOL=FALSE/TRUE setting? Currently,
> one can switch back to countOverlaps when feature overlap unawareness is
> the more appropriate counting mode for a biological question, but then
> double counting of reads mapping to multiple-range features is not
> accounted for. It would be really nice to have such a feature-overlap
> unaware option directly in summarizeOverlaps.
>
> Another question relates to the memory usage of summarizeOverlaps. Has
> this been optimized yet? On a typical bam file with ~50-100 million
> reads the memory usage of summarizeOverlaps is often around 10-20GB. To
> use the function on a desktop computer or in large-scale RNA-Seq
> projects on a commodity compute cluster, it would be desirable if each
> counting instance consumed no more than 5GB of RAM.
>
> Thanks in advance for your help and suggestions,
>
> Thomas
