[BioC] Integer overflow when summing an 'integer' Rle

Nicolas Delhomme delhomme at embl.de
Fri Feb 10 17:04:19 CET 2012


Hi all,

While calculating some statistics for an RNA-seq experiment, I stumbled onto the following problem. Applying the IRanges coverage function to my IRanges, I get back an integer Rle object. However, trying to get the mean or sum of that Rle object results in an integer overflow. The following example illustrates that overflow.

library(IRanges)
rC <- Rle(values=as.integer(c(1,(2^31)-1,1)))
sum(rC)
mean(rC)

Both calls result in an integer overflow:

[1] NA
Warning message:
In sum(runValue(x) * runLength(x), ..., na.rm = na.rm) :
  Integer overflow - use sum(as.numeric(.))

The workaround is to convert to numeric before multiplying and summing:

sum(as.numeric(runValue(rC)) * runLength(rC))

but IMO it should be handled in the Rle code itself; i.e. an integer Rle can clearly have a sum, a mean, etc. whose value lies outside the integer range. Is there anything that speaks against having these functions internally convert the integer values to numeric before calculating the sum or mean?
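
To make the suggestion concrete, here is a minimal sketch of what such an internal conversion could look like. It uses base R only; the helper names safe_rle_sum and safe_rle_mean are hypothetical and not part of IRanges, and the run values and run lengths are passed in as plain vectors. The point is that the conversion to numeric has to happen before the multiplication, since an integer product can itself overflow to NA.

```r
# Hypothetical helpers (not IRanges functions): sum and mean of a
# run-length encoded integer vector, computed in double precision.
safe_rle_sum <- function(values, lengths) {
  # Convert BEFORE multiplying so the product is never an integer.
  sum(as.numeric(values) * as.numeric(lengths))
}

safe_rle_mean <- function(values, lengths) {
  # Total divided by the number of encoded elements.
  safe_rle_sum(values, lengths) / sum(as.numeric(lengths))
}

# Same data as in the example above: three runs of length 1.
values  <- as.integer(c(1, 2^31 - 1, 1))
lengths <- c(1L, 1L, 1L)

total <- safe_rle_sum(values, lengths)
avg   <- safe_rle_mean(values, lengths)
print(total)
print(avg)
```

With an actual Rle this would presumably be called as safe_rle_sum(runValue(rC), runLength(rC)). Doubles represent integers exactly only up to 2^53, so this widens the usable range considerably but does not remove the limit altogether.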

Looking forward to hearing your thoughts on this,

Cheers,

Nico

sessionInfo()
R Under development (unstable) (2012-02-07 r58290)
Platform: x86_64-apple-darwin10.8.0 (64-bit)

locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] IRanges_1.13.24    BiocGenerics_0.1.4

loaded via a namespace (and not attached):
[1] tools_2.15.0



---------------------------------------------------------------
Nicolas Delhomme

Genome Biology Computational Support

European Molecular Biology Laboratory

Tel: +49 6221 387 8310
Email: nicolas.delhomme at embl.de
Meyerhofstrasse 1 - Postfach 10.2209
69102 Heidelberg, Germany


