[BioC] Rsamtools: Realloc integer overflow?

Martin Morgan mtmorgan at fhcrc.org
Tue Jun 4 03:26:13 CEST 2013

On 06/03/2013 05:27 PM, Michael Lawrence wrote:
> Hey guys,
> Whenever I try to calculate the coverage for a BAM file with more than say
> 500 million reads, I get this error:
> Error in coverage(readBamGappedAlignments(x, param = param), shift =
> shift,  : \n  error in evaluating the argument 'x' in selecting a method
> for function 'coverage': Error in value[[3L]](cond) (from #2) : \n
> 'Realloc' could not re-allocate memory (18446744065128005632 bytes)\n
> This looks like integer overflow, possibly within _grow_SCAN_BAM_DATA().
> Could we just use long there?
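
The number in the error message fits the overflow reading: 18446744065128005632 is exactly 2^64 - 8581545984, which is what a negative byte count of -8581545984 looks like once it is converted to a 64-bit unsigned size. The sketch below shows that arithmetic in C. The function names are hypothetical and are not taken from Rsamtools, and the count and element size in the note after it are only one combination that reproduces the number, not a claim about what _grow_SCAN_BAM_DATA() actually does.

```c
#include <assert.h>
#include <stdint.h>

/* What an unsigned 64-bit size parameter sees when it is handed a negative
   signed byte count. Signed-to-unsigned conversion is well defined in C:
   the value wraps modulo 2^64. */
uint64_t as_unsigned_size(int64_t signed_bytes)
{
    return (uint64_t) signed_bytes;
}

/* Hypothetical sizing helper (an assumption for illustration): bytes needed
   for `count` elements of `elt_size` bytes, with the element count held in
   a 32-bit signed int. A count that has already wrapped negative yields a
   negative byte total. */
int64_t bytes_for(int32_t count, int32_t elt_size)
{
    assert(elt_size > 0);
    return (int64_t) count * elt_size;
}
```

For instance, as_unsigned_size(bytes_for(-2145386496, 4)) gives 18446744065128005632. On a two's-complement machine -2145386496 is what an element count of 2149580800 becomes when stored in a 32-bit int, so a counter that grew just past 2^31 would be enough to produce the reported size.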

I wonder if it would be more sensible if less convenient to do this (under 

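   # fl is the path to the BAM file; yieldSize caps the records read per call
   bf <- open(BamFile(fl, yieldSize=100000000))
   cvg <- coverage(readGAlignmentsFromBam(bf))
   # each call reads the next chunk; a zero-length result means end of file
   while (length(aln <- readGAlignmentsFromBam(bf)))
       cvg <- cvg + coverage(aln)
   close(bf)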

? It opens the door for better memory management and parallel evaluation.

I'm concerned that using size_t (Realloc casts to this) or ptrdiff_t (the size 
of R long vectors) would only get us through the C code; the representation of 
this in R would require R long vectors, and Rsamtools does not (yet?) support that.
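
To make that concern concrete, here is a minimal C sketch. It assumes a buffer that grows by doubling, which is an illustrative rule and not necessarily the one Rsamtools uses, and both function names are hypothetical. Doing the growth arithmetic in size_t keeps the C side from going negative, but a separate check is still needed for whether the result can be returned as an ordinary R vector, whose length is limited to 2^31 - 1 elements.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical growth rule: double the capacity, computed in size_t so the
   arithmetic cannot go negative. Returns 0 if doubling would wrap. */
size_t grow_capacity(size_t current)
{
    if (current == 0)
        return 1;
    if (current > SIZE_MAX / 2)
        return 0;
    size_t next = current * 2;
    assert(next > current);
    return next;
}

/* Returns 1 if a buffer of n elements could still be handed back as an
   ordinary (non-long) R vector, whose length is a 32-bit signed int;
   returns 0 otherwise. */
int fits_ordinary_r_vector(size_t n)
{
    return n <= (size_t) INT32_MAX;
}
```

With these, a buffer of 2^30 elements still fits an ordinary R vector, while the next doubling to 2^31 is a valid size_t but no longer fits. That is the point at which a wider C type alone stops helping.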


> Michael

Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793
