[R] bitwise addition

Nameeta Lobo nlobo at bsd.uchicago.edu
Mon May 15 21:11:56 CEST 2006


Hello all

thank you very much for all your suggestions. I actually need binary
representations. I tried all the methods that Marc,Jim and Charles have
suggested and they ran fine(thanks a lot). I tried doing it then with 26 and 13
and that's when the computer gave way. I just got a message with all three
methods that a vector of .....Kb cannot be allocated. guess I will have to
change the environment to allow for huge vector size allocation. How do I do that?

Thanks once again all of you.

Nameeta


Quoting "Charles C. Berry" <cberry at tajo.ucsd.edu>:

> 
> Here is a solution that finds the 2042975 25-bit words with 9 bits 'on'
> in under 5 seconds on my PC. It finds the 5200300 25-bit words with 12 
> bits 'on' in 16 seconds.
> 
> Perhaps, this is the solution for a combinatorics homework problem.
> 
> Finding words with 9 bits on:
> 
> > hi.index <- 25
> > n.index <- 9
> > breaks <- lapply(1:n.index,function(x) choose(-1+1:hi.index,x))
> > index <- seq(0,length=choose(hi.index,n.index))
> > dec.index <- rep(as.integer(0),choose(hi.index,n.index))
> > system.time(
> +             for (i in n.index:1){
> +               pos <- findInterval(index,breaks[[i]])
> +               index <- index - breaks[[i]][pos]
> +               dec.index <- dec.index + 2^(pos-1)
> +             }
> +             )
> [1] 4.30 0.48 4.78   NA   NA
> > dec.index[1:10] # decimal representation of first ten
>   [1]  511  767  895  959  991 1007 1015 1019 1021 1022
> > gc()
>            used (Mb) gc trigger  (Mb) max used  (Mb)
> Ncells  180950  4.9     407500  10.9   350000   9.4
> Vcells 5180889 39.6   14895582 113.7 15399445 117.5
> 
> And finding words with 12 bits on:
> 
> > hi.index <- 25
> > n.index <- 12
> > breaks <- lapply(1:n.index,function(x) choose(-1+1:hi.index,x))
> > index <- seq(0,length=choose(hi.index,n.index))
> dec.index <- rep(as.integer(0),choose(hi.index,n.index))
> > > system.time(
> +             for (i in n.index:1){
> +               pos <- findInterval(index,breaks[[i]])
> +               index <- index - breaks[[i]][pos]
> +               dec.index <- dec.index + 2^(pos-1)
> +             }
> +             )
> 
> [1] 13.97  2.36 16.33    NA    NA
> dec.index[1:10]  # decimal representation of first ten
> >  [1] 4095 6143 7167 7679 7935 8063 8127 8159 8175 8183
> > gc()
>             used (Mb) gc trigger  (Mb) max used  (Mb)
> Ncells   180953  4.9     407500  10.9   350000   9.4
> Vcells 13074276 99.8   29645052 226.2 28675324 218.8
> 
> 
> Chuck Berry
> 
> 
> On Sun, 14 May 2006, Charles C. Berry wrote:
> 
> > On Sat, 13 May 2006, jim holtman wrote:
> >
> >> Using the algorithm in the reference
> >> www.everything2.com/index.pl?node_id=449702 and the 'bitops' package,
> here
> >> is some code that will select the numeric values with a specified number
> of
> >> bits.  It takes less than 12 seconds to select from 21-bit numbers. If I
> had
> >> more memory, I would expect that for 25-bit numbers it would take about
> 200
> >> seconds to compute the indices.
> >>
> >
> > The approach on the 'everything2' page is decidely cool.
> >
> > Since a fixed number of bits is sought one can operate on the indices of
> > the bits, rather than the words and bits per se.
> >
> > For example, selecting 2 bits from 4 bit words:
> >
> > res <- rep(as.double(0),choose(4,2))
> > k <- 0
> > for ( index.1 in (1):(3) )
> >   for ( index.2 in (index.1+1):(4) ){
> >     index <- c( index.1,index.2 )
> >     k <- k + 1
> >     res[k] <- sum( 2^(index-1) ) }
> >
> > gives
> >> res
> > [1]  3  5  9  6 10 12
> >
> > Of course, one could turn 'index' into a character representation of the
> > bits instead of a decimal representation, if that was desired.
> >
> >
> > To generalize this a little computing on the language helps. For 25 bit
> > words choosing those with 9 bits turned on:
> >
> > ##-----
> > hi.index <- 25
> > n.index <- 9
> > for.template <- "for ( index.#X# in (#FROM#):(#TO#) )"
> > for.text <- function(x)
> >   { from.text <- if (x==1) "1" else paste("index.",x-1,"+1",sep="")
> >     to.text <- as.character(hi.index-n.index+x)
> >     sub("#X#",x,sub("#TO#",to.text,sub("#FROM#",from.text,for.template)))
> >   }
> >
> >
> > all.text <- paste( "k <- 0;",
> >                   paste( sapply(1:n.index,for.text),collapse=" "),
> >                   "{index <- c(",
> >                   paste(sapply(1:n.index,function(x)
> > paste("index",x,sep='.')),collapse=","),
> >                   ");k <- k + 1; res[k] <- sum( 2^(index-1) ) }")
> > res <- rep(as.double(0),choose(hi.index,n.index))
> >
> > system.time(eval( parse(text=all.text ) ))
> >
> > ##-----
> >
> > On my 2GHz AMD 64 running on Windows XP, I get
> >
> >> system.time(eval( parse(text=all.text ) ))
> > [1] 35.17  0.08 35.41    NA    NA
> >> res[1:10]
> >  [1]    511    767   1279   2303   4351   8447  16639  33023  65791
> 131327
> >
> > Note that if ordered binary numbers are needed, an additional sort()
> > is required - adding about 1/10 seconds:
> >
> >> system.time(print(sort(res)[1:10]))
> >  [1]  511  767  895  959  991 1007 1015 1019 1021 1022
> > [1] 0.11 0.00 0.11   NA   NA
> >>
> >
> > The memory needed is modest, as you would expect:
> >
> >> gc()
> >           used (Mb) gc trigger (Mb) max used (Mb)
> > Ncells  178596  4.8     407500 10.9   407500 10.9
> > Vcells 2115922 16.2    5621217 42.9  4397614 33.6
> >>
> >
> > Running this using
> >
> >
> > text.bits <- function(x) {
> > 	tmp <- rep("0",hi.index)
> > 	tmp[hi.index+1-x] <- "1";paste(tmp,collapse="")}
> >
> > res <- rep("0000000000000000111111111",choose(hi.index,n.index))
> >
> > and substituting
> >
> > 	res[k] <- text.bits(index)
> >
> > in place of "res[k] <- sum( 2^(index-1) )" gives:
> >
> >> system.time(eval( parse(text=all.text ) ))
> > [1] 201.31   0.12 204.44     NA     NA
> >> res[1:10]
> >  [1] "0000000000000000111111111" "0000000000000001011111111"
> >  [3] "0000000000000010011111111" "0000000000000100011111111"
> >  [5] "0000000000001000011111111" "0000000000010000011111111"
> >  [7] "0000000000100000011111111" "0000000001000000011111111"
> >  [9] "0000000010000000011111111" "0000000100000000011111111"
> >>
> >> gc()
> >           used (Mb) gc trigger (Mb) max used (Mb)
> > Ncells 2221730 59.4    3307833 88.4  3307833 88.4
> > Vcells 9266414 70.7   11894938 90.8 12374438 94.5
> >>
> >
> > And running this with the decimal representation for 'res' for 25 bit
> > words choosing those with 12 bits turned on:
> >
> >> hi.index <- 25
> >> n.index <- 12
> >   .
> >   .
> >   .
> >> system.time(eval( parse(text=all.text ) ))
> > [1] 101.78   0.08 104.14     NA     NA
> >
> >
> > Chuck
> >
> > [rest deleted]
> >
> >
> > Charles C. Berry                        (858) 534-2098
> >                                          Dept of Family/Preventive
> Medicine
> > E mailto:cberry at tajo.ucsd.edu	         UC San Diego
> > http://biostat.ucsd.edu/~cberry/         La Jolla, San Diego 92093-0717
> >
> >
> >
> >    [ Part 3.15: "Included Message" ]
> >
> 
> Charles C. Berry                        (858) 534-2098
>                                           Dept of Family/Preventive
> Medicine
> E mailto:cberry at tajo.ucsd.edu	         UC San Diego
> http://biostat.ucsd.edu/~cberry/         La Jolla, San Diego 92093-0717
> 
> 
> 
> 
> 





-------------------------------------------------
This email is intended only for the use of the individual or...{{dropped}}




More information about the R-help mailing list