[BioC] IRanges: Request for a "step" argument in runsum

Arnaud Amzallag arnaud.amzallag at gmail.com
Fri May 6 23:54:20 CEST 2011


Dear IRanges developers, 

runsum is a very fast and convenient function for computing on Rle objects such as coverages. However, when it is run on several chromosomes and several samples, it can become very memory intensive. For instance, on human chromosome 1 it outputs a vector of length about 250 million, so for several full genomes it quickly adds up to billions of numbers in memory.

Often, however, you don't need single-base resolution. I wanted to suggest, if it is possible, adding a parameter that lets the sliding window advance by a user-defined step, rather than always by one position ("step=1") as it does now. For example, runsum(myRle, k=1e4, step = 1000) would return the equivalent of a wig file, with one value per 10-kilobase window and windows starting every 1 kilobase along the genome, without saturating the memory of the server.

I tried sum(Views(myRle, ir)) instead; it is less memory intensive but much slower. The proposed enhancement would give the best of both worlds: fast and memory efficient.
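
To make the request concrete, here is a minimal sketch in base R of the output I have in mind. The name runsum_step is hypothetical and not part of IRanges, and the cumulative-sum approach is only an illustration, not a claim about how runsum is implemented. It works on a plain numeric vector, so it shows the expected values rather than the memory savings.

```r
# Hypothetical helper (NOT an IRanges function): sum of every window of
# width k, with consecutive window starts spaced `step` positions apart.
runsum_step <- function(x, k, step = 1L) {
  n <- length(x)
  stopifnot(k >= 1, step >= 1, k <= n)
  starts <- seq(1, n - k + 1, by = step)  # first position of each window
  cs <- c(0, cumsum(as.numeric(x)))       # cs[i + 1] == sum(x[1:i])
  cs[starts + k] - cs[starts]             # sum(x[s:(s + k - 1)]) per start s
}

x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
# Windows cover positions 1-4, 4-7 and 7-10.
runsum_step(x, k = 4, step = 3)
```

With step = 1 this returns one value per position, as the current behaviour does; with a larger step the output shrinks by roughly that factor, which is the point of the request.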

Kind regards,

Arnaud Amzallag
Research Fellow
Mass General Cancer Center / Harvard Medical School