[Rd] gc()$Vcells < 0 (PR#9345)
vdergachev at rcgardis.com
Tue Nov 7 16:45:26 CET 2006
On Tuesday 07 November 2006 6:28 am, Prof Brian Ripley wrote:
> On Mon, 6 Nov 2006, Vladimir Dergachev wrote:
> > On Monday 06 November 2006 6:12 pm, dmaszle at mendelbio.com wrote:
> >> version.string Version 2.3.0 (2006-04-24)
> >>> x<-matrix(nrow=44000,ncol=48000)
> >>> y<-matrix(nrow=44000,ncol=48000)
> >>> z<-matrix(nrow=44000,ncol=48000)
> >>> gc()
> >>               used    (Mb) gc trigger    (Mb) max used    (Mb)
> >> Ncells      177801     9.5     407500    21.8   350000    18.7
> >> Vcells -1126881981 24170.6         NA 24173.4       NA 24170.6
> > Happens to me with versions 2.4.0 and 2.3.1. The culprit is this line
> > in src/main/memory.c:
> > INTEGER(value) = R_VSize - VHEAP_FREE();
> > Since the amount used is greater than 4G and INTEGER is 32 bits long
> > (even on 64-bit machines), this returns (harmless) nonsense.
> That's not quite correct. The units here are Vcells (8 bytes), and
> integer() is signed, so this can happen only if more than 16Gb of heap is
I see, thank you for the explanation!
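To convince myself, here is a minimal sketch of the arithmetic. It is my own illustration, not code from memory.c: the helper names wrap_to_int32 and vcells_to_mb are made up, and the true count of 3168085315 Vcells is an assumption inferred from the report (the printed value plus 2^32).

```c
#include <stdint.h>

/* Hypothetical helper: what a signed 32-bit field would show for a
   64-bit count, written so it avoids implementation-defined casts. */
int32_t wrap_to_int32(int64_t n)
{
    uint32_t low = (uint32_t)n;   /* keep only the low 32 bits */
    if (low <= (uint32_t)INT32_MAX)
        return (int32_t)low;      /* value fits, nothing is lost */
    /* High bit set: reinterpret as a negative two's complement value. */
    return (int32_t)((int64_t)low - 4294967296LL);
}

/* Hypothetical helper: size in Mb (2^20 bytes) of n Vcells of
   8 bytes each, computed in double so it cannot wrap. */
double vcells_to_mb(int64_t n)
{
    return (double)n * 8.0 / 1048576.0;
}
```

Under that assumption, wrap_to_int32(3168085315LL) gives the -1126881981 shown by gc(), while vcells_to_mb(3168085315LL) comes out at roughly 24170.6, which would explain why the Mb column is still right.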
> We are aware that we begin to hit problems at 16Gb: it is for example the
> maximum size of an R vector. Those objects are logical and so about 7.8Gb
> each: their length as vectors is 98% of the maximum possible. However,
> the first time we discussed it we thought it would be about 5 years before
> those limits would become important -- I think three of those years have
> since passed.
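A rough check of those figures, as a sketch only. I am assuming 4 bytes per logical element (the same as an int) and a maximum vector length of 2^31 - 1; the function names are mine and do not come from the R sources.

```c
#include <stdint.h>

/* Assumption: a logical element occupies 4 bytes, like an int. */
#define LOGICAL_BYTES 4

/* Number of elements in an nrow x ncol matrix, in 64-bit arithmetic. */
int64_t matrix_length(int64_t nrow, int64_t ncol)
{
    return nrow * ncol;
}

/* Data size in Gb (2^30 bytes), ignoring the vector header. */
double logical_matrix_gb(int64_t nrow, int64_t ncol)
{
    return (double)matrix_length(nrow, ncol) * LOGICAL_BYTES / 1073741824.0;
}

/* Fraction of the largest length a signed 32-bit length field allows. */
double fraction_of_max_length(int64_t length)
{
    return (double)length / (double)INT32_MAX;
}
```

For a 44000 x 48000 matrix this gives 2112000000 elements, a little under 7.9 Gb of data, and a bit more than 98% of the assumed maximum length, in line with the numbers quoted above.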
> > The megabyte value nearby is correct and gc trigger and max used fields
> > are marked as NA already.
> and now 'used' is also marked as NA in 2.4.0 patched.
Great, thank you!
> This is only a reporting issue. When I first used R it reported only
> numbers, and I added the Mb as a more comprehensible figure (especially
> for Ncells). I think it would be sensible now to only report these
> figures in Mb or Gb (and also the reports for gcinfo(TRUE)).
Why not use KB? This still preserves information about small allocations and
raises the limit to about 2 TB - surely at least 5 years off! :)
Alternatively, doubles should be able to hold the entire number, but this
would require changes to how information is displayed.
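To put numbers on both options, here is a small sketch. It assumes the counter stays a signed 32-bit integer and that double is an IEEE 754 double with a 53-bit significand; counter_limit_bytes and exact_in_double are hypothetical names, not anything in R.

```c
#include <stdint.h>

/* Largest byte count a signed 32-bit counter can express when each
   count stands for unit_bytes bytes. */
int64_t counter_limit_bytes(int64_t unit_bytes)
{
    return (int64_t)INT32_MAX * unit_bytes;
}

/* Returns 1 if n lies in the range where every integer is exactly
   representable as an IEEE 754 double (magnitude up to 2^53), else 0. */
int exact_in_double(int64_t n)
{
    const int64_t limit = (int64_t)1 << 53;
    return n >= -limit && n <= limit;
}
```

With 8-byte Vcells, counter_limit_bytes(8) is just under 16 Gb; with kilobytes, counter_limit_bytes(1024) is just under 2 TB. A count of a few billion Vcells is far below 2^53, so a double would hold it exactly.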
> The model behind the report actually pre-dates the GC change in 1.2.0.
> The 'Vcells' are nowadays the sum of all the allocations from VECSXPs
> (which include their headers), rather than the 'vector heap' (although
> some of the earlier terminology persists).
Thank you!