[R] large integers in R

Thu Jan 28 12:39:29 CET 2010

On Thu, Jan 28, 2010 at 11:21 AM, Duncan Murdoch <murdoch at stats.uwo.ca> wrote:
> On 28/01/2010 5:30 AM, Benilton Carvalho wrote:
>>
>> Hi Duncan,
>>
>> On Tue, Jan 26, 2010 at 9:09 PM, Duncan Murdoch <murdoch at stats.uwo.ca> wrote:
>>>
>>> On 26/01/2010 3:25 PM, Blanford, Glenn wrote:
>>>>
>>>> Has there been any update on R's handling of large integers greater than
>>>> 10^9 (between 10^9 and 4x10^9)?
>>>>
>>>> as.integer() in R 2.9.2 lists this as a restriction but doesn't list the
>>>> actual limit or cause, nor whether anyone was looking at fixing it.
>>>
>>> Integers in R are 4 byte signed integers, so the upper limit is 2^31-1.
>>>  That's not likely to change soon.
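
A minimal sketch of this limit, assuming a standard R build where integers are 4-byte signed. The names `overflowed` and `big` are made up for illustration:

```r
# Largest value an R integer can hold (4-byte signed): 2^31 - 1
.Machine$integer.max

# Integer arithmetic past that bound overflows to NA (R also issues a warning,
# suppressed here so the sketch runs quietly)
overflowed <- suppressWarnings(.Machine$integer.max + 1L)
is.na(overflowed)

# A value in the 10^9 to 4x10^9 range fits exactly in a double
big <- 4e9
big + 1 - big     # 1
is.integer(big)   # FALSE: stored as a double, not a 4-byte integer
```

Keeping such values as doubles avoids the integer bound, as long as nothing coerces them back with as.integer().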
>>
>> But in the hypothetical scenario that this were to change soon and we
>> were to have a 64-bit integer type (say, under a 64-bit OS),
>> wouldn't this allow us to have objects whose length exceeded the
>> 2^31-1 limit?
>
> Those are certainly related problems, but you don't need 64 bit integers to
> have longer vectors.  We could switch to indexing by doubles in R (though
> internally the indexing would probably be done in 64 bit ints).
>
> A problem with exposing 64-bit ints in R is that they break the rule that
> doubles can represent any integer exactly.  If x is an integer, x+1 is a
> double, and it would be unfortunate if (x+1) != (x+1L), as will happen with
> values of 2^53 or larger.
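
A small sketch of the precision issue, assuming IEEE 754 doubles with a 53-bit significand (so the exact bound is 2^53). It uses only doubles, since R has no built-in 64-bit integer type to compare against:

```r
# Every integer up to 2^53 is exactly representable as a double
x <- 2^53
(x - 1) + 1 == x      # TRUE: still exact at and below the bound

# Above the bound, odd integers cannot be represented
x + 1 == x            # TRUE: 2^53 + 1 rounds back to 2^53

# Adjacent doubles just above 2^53 are spaced 2 apart
(x + 2) - x           # 2
```

A true 64-bit integer would distinguish 2^53 from 2^53 + 1, which is where integer and double arithmetic would start to disagree.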

I see... thanks for the clarification. I'm sure that changes like this
would bring several side effects, but regardless of how they're done I
believe they would benefit the whole community.

As an example, with a BioC package I wrote, we can "only" analyze
roughly 2K samples on a given microarray platform (despite the fact
that we have way more than 2K samples). In the meantime, we have to
use other tricks to work around the 2^31-1 limit.

Again, thank you very much,

Benilton Carvalho

>
> Duncan Murdoch
>
>>
>> Benilton Carvalho
>>
>>> The double type in R can hold exact integer values up to 2^53. So, for
>>> example, calculations like this work fine:
>>>
>>> > x <- 2^50
>>> > y <- x + 1
>>> > y - x
>>> [1] 1
>>>
>>> Just don't ask R to put those values into a 4-byte integer; they won't
>>> fit:
>>>
>>> > as.integer(c(x, y))
>>> [1] NA NA
>>> Warning message:
>>> NAs introduced by coercion
>>>
>>> Duncan Murdoch
>>>
>>>> Glenn D Blanford, PhD
>>>> glenn.blanford at us.army.mil
>>>> Scientific Research Corporation
>>>> gblanford at scires.com
>>>>
>>>>
>>>> ______________________________________________
>>>> R-help at r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.