[R] bitwise addition

Tue May 16 15:28:59 CEST 2006

On Tue, 2006-05-16 at 09:45 +0200, Uwe Ligges wrote:
> Nameeta Lobo wrote:
> 
> > Hello all
> > 
> > thank you very much for all your suggestions. I actually need binary
> > representations. I tried all the methods that Marc,Jim and Charles have
> > suggested and they ran fine(thanks a lot). I tried doing it then with 26 and 13
> > and that's when the computer gave way. I just got a message with all three
> > methods that a vector of .....Kb cannot be allocated. guess I will have to
> > change the environment to allow for huge vector size allocation. How do I do that?
> 
> 
> You should have *at least* 512Mb in your machine for the solution given 
> by Charles C. Berry with the numbers given above, better a machine with 1Gb.
> 
> Uwe Ligges

In addition to Uwe's comment, there are some practical issues that will
apply here shortly if Nameeta continues to increase the size of the
source vector:

1. R has a limitation of 2^31 - 1 elements in a vector. This is the same
for both 32 and 64 bit platforms. Thus, if Nameeta continues to expand
the upper limit of the range, that limit will be reached fairly quickly.
Going beyond it would then require some form of partitioning approach.

2. The RAM requirements to simply apply Charles' solution will continue
to expand as the upper limit increases, so Uwe's figure covers only the
indicated example of 2^26 and will be insufficient beyond that.

3. This still does not address Nameeta's now explicitly stated desire
for the binary character representations, which requires additional
memory beyond that required for the initial step of identifying the
numbers that meet the 'bit requirements' alone.
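
As a rough back-of-the-envelope check on point 2, the storage for a
single vector covering the full range 0 to 2^n - 1 can be estimated from
the element size alone (4 bytes per integer, 8 per double in R). This is
only a sketch: 'vec.mb' is a hypothetical helper, it ignores the vector
header and any intermediate copies, and it is not Charles' actual code,
whose working memory will likely be some multiple of these figures.

```r
# Approximate storage, in Mb (2^20 bytes), for one vector of 2^n elements.
# 'vec.mb' is a hypothetical helper, used here for illustration only.
vec.mb <- function(n, bytes.per.element) {
  2^n * bytes.per.element / 2^20
}

vec.mb(26, 4)  # integer vector: 256
vec.mb(26, 8)  # double vector:  512
vec.mb(28, 8)  # two more bits:  2048
```

Each additional bit in the upper limit doubles these numbers.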

From my prior post over the weekend: storing the character matrix of
binary representations for 2^25 with 9 bits, which contained 2,042,975
values, required approximately 128 Mb for the final paste()'d versions
of the numbers.

That is AFTER doing the initial conversion using digitsBase(), which
required 400 Mb to store the intermediate integer matrix result.  One
could certainly do that in a partitioned or loop based approach to
conserve memory, but it still will hit practical limits in short order.
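
A minimal sketch of such a partitioned approach follows, assuming the
'bit requirement' is that exactly k of the n bits are set. The names
'count.bits', 'select.chunked' and 'chunk.size' are hypothetical, and
plain integer arithmetic stands in for digitsBase() here, so this is
not the code from the earlier posts. Only one chunk of the range is
held in memory at a time, plus the values selected so far.

```r
# Number of set bits in each element of x (non-negative whole numbers
# below 2^n.bits), computed with arithmetic only.
count.bits <- function(x, n.bits) {
  total <- numeric(length(x))
  for (i in 0:(n.bits - 1)) {
    total <- total + (x %/% 2^i) %% 2
  }
  total
}

# Walk the range 0 .. 2^n.bits - 1 in chunks, keeping only the values
# with exactly k bits set.
select.chunked <- function(n.bits, k, chunk.size = 2^16) {
  upper <- 2^n.bits - 1
  starts <- seq(0, upper, by = chunk.size)
  out <- vector("list", length(starts))
  for (j in seq_along(starts)) {
    x <- seq(starts[j], min(starts[j] + chunk.size - 1, upper))
    out[[j]] <- x[count.bits(x, n.bits) == k]
  }
  unlist(out)
}

# Small example: 10 bits, 3 of them set, processed 256 values at a time
res <- select.chunked(10, 3, chunk.size = 2^8)
length(res) == choose(10, 3)
```

The result vector itself still grows with the range, so this only
reduces the working memory, not the size of the final result.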

Those figures too will expand dramatically as the upper limit increases.

For example, going from 2^24 with 12 bits to 2^26 with 13 bits takes the
result from 2,704,156 values to 10,400,600. That's a 3.8 fold increase
in the result vector size. It does not take long to figure out how much
memory will be required for these operations as the upper range
increases.
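
If the requirement is that exactly k of the n bits are set (an
assumption on my part, though it matches the counts quoted here), the
size of the result is just a binomial coefficient, so it can be checked
with choose() before any memory is committed:

```r
# Number of n-bit values with exactly k bits set is choose(n, k)
n.24 <- choose(24, 12)   # 2704156
n.26 <- choose(26, 13)   # 10400600
n.25 <- choose(25, 9)    # 2042975

# Growth factor from the 24/12 case to the 26/13 case
round(n.26 / n.24, 1)    # 3.8
```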

Depending upon what Nameeta is planning to do with the final resultant
character vectors, one could consider a loop based print method/function
that takes the values in the initial 'dec.index' vector and simply
cat()'s them to some output. However, you would not be able to actually
store them as a single matrix given the memory requirements.
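
A rough sketch of that idea, assuming 'dec.index' is a numeric vector of
non-negative whole numbers. The functions 'to.binary' and 'write.binary'
and the 'chunk.size' argument are hypothetical names for illustration,
and the conversion uses plain arithmetic rather than digitsBase(). Only
one chunk of character strings exists at any time.

```r
# Convert whole numbers to fixed-width binary strings, most significant
# bit first. One row of 'bits' per value, one column per bit.
to.binary <- function(x, n.bits) {
  bits <- outer(x, 2^((n.bits - 1):0), function(a, b) (a %/% b) %% 2)
  apply(bits, 1, paste, collapse = "")
}

# cat() the binary strings to a connection, chunk.size values at a time
write.binary <- function(dec.index, n.bits, con = stdout(),
                         chunk.size = 10000) {
  if (length(dec.index) == 0) return(invisible(0))
  starts <- seq(1, length(dec.index), by = chunk.size)
  for (s in starts) {
    idx <- s:min(s + chunk.size - 1, length(dec.index))
    cat(paste0(to.binary(dec.index[idx], n.bits), "\n"),
        sep = "", file = con)
  }
  invisible(length(dec.index))
}

# Prints one binary string per line to the console
write.binary(c(7, 11, 13), n.bits = 4)
```

To write to disk instead, open a connection with file("bits.txt", "w")
(the file name is just an example), pass it as 'con', and close() it
afterwards.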

Perhaps Nameeta can indicate what the primary problem is here, which
might in turn allow someone to offer an alternative approach that is
more resource sparing.

HTH,

Marc Schwartz