[R] bitwise addition

Charles C. Berry cberry at tajo.ucsd.edu
Mon May 15 18:50:08 CEST 2006


Here is a solution that finds the 2042975 25-bit words with 9 bits 'on'
in under 5 seconds on my PC. It finds the 5200300 25-bit words with 12 
bits 'on' in 16 seconds.

Perhaps, this is the solution for a combinatorics homework problem.

Finding words with 9 bits on:

> hi.index <- 25
> n.index <- 9
> breaks <- lapply(1:n.index,function(x) choose(-1+1:hi.index,x))
> index <- seq(0,length=choose(hi.index,n.index))
> dec.index <- rep(as.integer(0),choose(hi.index,n.index))
> system.time(
+             for (i in n.index:1){
+               pos <- findInterval(index,breaks[[i]])
+               index <- index - breaks[[i]][pos]
+               dec.index <- dec.index + 2^(pos-1)
+             }
+             )
[1] 4.30 0.48 4.78   NA   NA
> dec.index[1:10] # decimal representation of first ten
  [1]  511  767  895  959  991 1007 1015 1019 1021 1022
> gc()
           used (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  180950  4.9     407500  10.9   350000   9.4
Vcells 5180889 39.6   14895582 113.7 15399445 117.5

And finding words with 12 bits on:

> hi.index <- 25
> n.index <- 12
> breaks <- lapply(1:n.index,function(x) choose(-1+1:hi.index,x))
> index <- seq(0,length=choose(hi.index,n.index))
dec.index <- rep(as.integer(0),choose(hi.index,n.index))
> > system.time(
+             for (i in n.index:1){
+               pos <- findInterval(index,breaks[[i]])
+               index <- index - breaks[[i]][pos]
+               dec.index <- dec.index + 2^(pos-1)
+             }
+             )

[1] 13.97  2.36 16.33    NA    NA
dec.index[1:10]  # decimal representation of first ten
>  [1] 4095 6143 7167 7679 7935 8063 8127 8159 8175 8183
> gc()
            used (Mb) gc trigger  (Mb) max used  (Mb)
Ncells   180953  4.9     407500  10.9   350000   9.4
Vcells 13074276 99.8   29645052 226.2 28675324 218.8


Chuck Berry


On Sun, 14 May 2006, Charles C. Berry wrote:

> On Sat, 13 May 2006, jim holtman wrote:
>
>> Using the algorithm in the reference
>> www.everything2.com/index.pl?node_id=449702 and the 'bitops' package, here
>> is some code that will select the numeric values with a specified number of
>> bits.  It takes less than 12 seconds to select from 21-bit numbers. If I had
>> more memory, I would expect that for 25-bit numbers it would take about 200
>> seconds to compute the indices.
>>
>
> The approach on the 'everything2' page is decidely cool.
>
> Since a fixed number of bits is sought one can operate on the indices of
> the bits, rather than the words and bits per se.
>
> For example, selecting 2 bits from 4 bit words:
>
> res <- rep(as.double(0),choose(4,2))
> k <- 0
> for ( index.1 in (1):(3) )
>   for ( index.2 in (index.1+1):(4) ){
>     index <- c( index.1,index.2 )
>     k <- k + 1
>     res[k] <- sum( 2^(index-1) ) }
>
> gives
>> res
> [1]  3  5  9  6 10 12
>
> Of course, one could turn 'index' into a character representation of the
> bits instead of a decimal representation, if that was desired.
>
>
> To generalize this a little computing on the language helps. For 25 bit
> words choosing those with 9 bits turned on:
>
> ##-----
> hi.index <- 25
> n.index <- 9
> for.template <- "for ( index.#X# in (#FROM#):(#TO#) )"
> for.text <- function(x)
>   { from.text <- if (x==1) "1" else paste("index.",x-1,"+1",sep="")
>     to.text <- as.character(hi.index-n.index+x)
>     sub("#X#",x,sub("#TO#",to.text,sub("#FROM#",from.text,for.template)))
>   }
>
>
> all.text <- paste( "k <- 0;",
>                   paste( sapply(1:n.index,for.text),collapse=" "),
>                   "{index <- c(",
>                   paste(sapply(1:n.index,function(x)
> paste("index",x,sep='.')),collapse=","),
>                   ");k <- k + 1; res[k] <- sum( 2^(index-1) ) }")
> res <- rep(as.double(0),choose(hi.index,n.index))
>
> system.time(eval( parse(text=all.text ) ))
>
> ##-----
>
> On my 2GHz AMD 64 running on Windows XP, I get
>
>> system.time(eval( parse(text=all.text ) ))
> [1] 35.17  0.08 35.41    NA    NA
>> res[1:10]
>  [1]    511    767   1279   2303   4351   8447  16639  33023  65791 131327
>
> Note that if ordered binary numbers are needed, an additional sort()
> is required - adding about 1/10 seconds:
>
>> system.time(print(sort(res)[1:10]))
>  [1]  511  767  895  959  991 1007 1015 1019 1021 1022
> [1] 0.11 0.00 0.11   NA   NA
>>
>
> The memory needed is modest, as you would expect:
>
>> gc()
>           used (Mb) gc trigger (Mb) max used (Mb)
> Ncells  178596  4.8     407500 10.9   407500 10.9
> Vcells 2115922 16.2    5621217 42.9  4397614 33.6
>>
>
> Running this using
>
>
> text.bits <- function(x) {
> 	tmp <- rep("0",hi.index)
> 	tmp[hi.index+1-x] <- "1";paste(tmp,collapse="")}
>
> res <- rep("0000000000000000111111111",choose(hi.index,n.index))
>
> and substituting
>
> 	res[k] <- text.bits(index)
>
> in place of "res[k] <- sum( 2^(index-1) )" gives:
>
>> system.time(eval( parse(text=all.text ) ))
> [1] 201.31   0.12 204.44     NA     NA
>> res[1:10]
>  [1] "0000000000000000111111111" "0000000000000001011111111"
>  [3] "0000000000000010011111111" "0000000000000100011111111"
>  [5] "0000000000001000011111111" "0000000000010000011111111"
>  [7] "0000000000100000011111111" "0000000001000000011111111"
>  [9] "0000000010000000011111111" "0000000100000000011111111"
>>
>> gc()
>           used (Mb) gc trigger (Mb) max used (Mb)
> Ncells 2221730 59.4    3307833 88.4  3307833 88.4
> Vcells 9266414 70.7   11894938 90.8 12374438 94.5
>>
>
> And running this with the decimal representation for 'res' for 25 bit
> words choosing those with 12 bits turned on:
>
>> hi.index <- 25
>> n.index <- 12
>   .
>   .
>   .
>> system.time(eval( parse(text=all.text ) ))
> [1] 101.78   0.08 104.14     NA     NA
>
>
> Chuck
>
> [rest deleted]
>
>
> Charles C. Berry                        (858) 534-2098
>                                          Dept of Family/Preventive Medicine
> E mailto:cberry at tajo.ucsd.edu	         UC San Diego
> http://biostat.ucsd.edu/~cberry/         La Jolla, San Diego 92093-0717
>
>
>
>    [ Part 3.15: "Included Message" ]
>

Charles C. Berry                        (858) 534-2098
                                          Dept of Family/Preventive Medicine
E mailto:cberry at tajo.ucsd.edu	         UC San Diego
http://biostat.ucsd.edu/~cberry/         La Jolla, San Diego 92093-0717




More information about the R-help mailing list