[R] Summary of why R has the 2^31-1 limit?

Mon Jun 29 12:00:30 CEST 2009

I know it has been discussed before, but is there a good summary
anywhere of (a) why R has the 2^31-1 vector length limit on all platforms
(specifically 64-bit, of course) and (b) what the effort and
implications of changing it would be?  I think I have seen one, but I
couldn't find it, and it does not seem to be in the FAQ.

It is *really* annoying me now :(  My code is littered with 'if 
(prod(dim(...)) < 2^31) {do.something.reasonable()} else 
{custom.workaround.for.this.package.and.this.problem.at.this.time.and.please.just.shoot.me.now()}'.
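
For what it is worth, the same guard can be written on the C side without going through floating point. This is only a minimal sketch, assuming the dimensions arrive as 64-bit integers; the function name and the VECTOR_LEN_MAX constant are made up for illustration and are not part of R's API:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical constant: the 2^31-1 vector length limit discussed here. */
#define VECTOR_LEN_MAX INT32_MAX

/* Returns 1 if an nrow x ncol matrix fits in one vector of at most
 * 2^31-1 elements, 0 otherwise. Dividing instead of multiplying keeps
 * the check itself from overflowing. */
static int fits_vector_limit(int64_t nrow, int64_t ncol)
{
    assert(nrow >= 0 && ncol >= 0);   /* dimensions cannot be negative */
    if (nrow == 0 || ncol == 0)
        return 1;                     /* an empty matrix always fits */
    return nrow <= (int64_t)VECTOR_LEN_MAX / ncol;
}
```

A 46340 x 46340 matrix passes this check and a 46341 x 46341 one does not, which is roughly where the workarounds start.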

The trouble seems to be that (a) many routines rely on converting
data.frame (or similar) objects to matrix objects and (b) matrix objects
are stored as a single vector, so the vector length limit caps the total
number of cells.  Naively, the possible fixes would seem to be (1)
changing the storage of matrix objects, which sounds like a major PITA, or
(2) changing the index of a vector to a long int or similar.  The latter
might be a problem for interoperability (e.g. save()) and possibly for
underlying libraries, but it seems more manageable, even if it has to be a
fork (Q-project, anyone, or should we get a decent name this time?) and
even if there are probably internal horrors around that make it harder
than I think it would be.
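
To illustrate the index-width idea: in a matrix stored as one column-major vector, element (i, j) lives at offset i + j*nrow, and that offset is what outgrows a 32-bit index. The following is a minimal sketch, not R's actual internals. The function names are hypothetical, and int64_t is used rather than long int because long is still 32 bits on 64-bit Windows:

```c
#include <assert.h>
#include <stdint.h>

/* Zero-based offset of element (i, j) in a column-major matrix with
 * nrow rows, computed in 64 bits so it does not wrap for large matrices. */
static int64_t colmajor_offset(int64_t i, int64_t j, int64_t nrow)
{
    assert(i >= 0 && i < nrow && j >= 0);
    return i + j * nrow;
}

/* Returns 1 if the offset could still be held in a 32-bit signed index. */
static int offset_fits_int32(int64_t offset)
{
    return offset >= 0 && offset <= INT32_MAX;
}
```

For a 50000 x 50000 matrix the last element sits at offset 2499999999, which is past 2^31-1, so a 32-bit index cannot address it even though the data would fit comfortably in memory on a 64-bit machine.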

I am tired of writing C code for the *only* reason that R has this 
stupid (you wouldn't implement it like that if you had to start again, 
would you?) limitation.

Allan.