[Rd] dict package: dictionary data structure for R

Martin Maechler maechler at stat.math.ethz.ch
Tue Jul 24 19:32:47 CEST 2007


>>>>> "HenrikB" == Henrik Bengtsson <hb at stat.berkeley.edu>
>>>>>     on Tue, 24 Jul 2007 18:58:04 +0200 writes:

    HenrikB> On 7/23/07, Seth Falcon <sfalcon at fhcrc.org> wrote:
    >> Bill Dunlap <bill at insightful.com> writes:
    >> > With environments, if you use a prime number for the size
    >> > you get considerably better results.  E.g.,
    >> 
    >> > Perhaps new.env() should push the requested size up
    >> > to the next prime by default.
    >> 
    >> Perhaps.  I think we should also investigate other hashing functions
    >> since computing the next prime and doing so for resizes will take
    >> longer than not having to do it and it will add complexity to the
    >> code.

    HenrikB> An alternative is to hard-wire primes within a reasonable range:

    HenrikB> http://primes.utm.edu/lists/small/millions/
    HenrikB> http://www.math.utah.edu/~pa/math/p10000.html

    HenrikB> Maybe primes close to 2^n are good enough for this problem:

    HenrikB> http://primes.utm.edu/lists/2small/

Yes, I had a similar thought....

Note that you don't need web sites for prime numbers:

The R factorization utilities that I have mentioned a few times,
e.g., here
      http://tolstoy.newcastle.edu.au/R/help/05/01/10007.html

can give all primes up to a few hundred thousand quickly enough:

  > source("ftp://stat.ethz.ch/U/maechler/R/prime-numbers-fn.R")

  > system.time(PS3 <- prime.sieve(prime.sieve(prime.sieve())))
     user  system elapsed 
    0.446   0.006   0.452 

  > head(PS3, 20)
   [1]  2  3  5  7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71
  > tail(PS3, 20)
   [1] 273233 273253 273269 273271 273281 273283 273289 273311 273313 273323
  [11] 273349 273359 273367 273433 273457 273473 273503 273517 273521 273527
  > 
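In case that file is not at hand, here is a minimal sketch of the same idea.
It is NOT the prime.sieve() from the file above: the name sieve_extend and
its default argument are made up for illustration.  It only assumes that one
sieving step takes all primes up to max(p) and returns all primes up to
max(p)^2, which would be consistent with the output above, since
523^2 = 273529.

```r
# Hypothetical helper, not the prime.sieve() sourced above.
# Given ALL primes up to max(p), return all primes up to max(p)^2.
sieve_extend <- function(p = c(2, 3, 5)) {
  n <- max(p)^2
  is_prime <- rep(TRUE, n)
  is_prime[1] <- FALSE
  for (q in p) {
    # every composite <= n has a prime factor <= sqrt(n) = max(p),
    # so crossing out multiples of the known primes is sufficient
    is_prime[seq(q * q, n, by = q)] <- FALSE
  }
  which(is_prime)
}

# three nested steps, analogous to the call timed above
PS3 <- sieve_extend(sieve_extend(sieve_extend()))
```

Each step squares the upper limit, so a handful of nested calls already
reaches a range that should be ample for hash table sizes.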

There are more prime / factorization utilities in that simple R
source file, but, as I say there, one should really use C code to
do this; then again, R has become so fast ...
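
For the suggestion quoted above, pushing the requested size up to the next
prime, a rough sketch at the R level could look like the following.  The
helpers is_prime() and next_prime() are hypothetical names, plain trial
division is used only for simplicity, and nothing here reflects what
new.env() does internally; it merely passes a prime as the 'size' argument.

```r
# Hypothetical helpers for illustration; trial division is adequate
# for the sizes one would realistically request for an environment.
is_prime <- function(n) {
  if (n < 2) return(FALSE)
  if (n < 4) return(TRUE)           # 2 and 3
  if (n %% 2 == 0) return(FALSE)
  d <- 3
  while (d * d <= n) {
    if (n %% d == 0) return(FALSE)
    d <- d + 2
  }
  TRUE
}

# smallest prime >= n
next_prime <- function(n) {
  n <- max(2, ceiling(n))
  while (!is_prime(n)) n <- n + 1
  n
}

# request a hashed environment whose initial size is a prime
e <- new.env(hash = TRUE, size = as.integer(next_prime(1000)))
```

Whether this actually gives better hashing would have to be measured; the
sketch only shows that finding the prime is cheap for a single call.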

Martin Maechler, ETH Zurich

    HenrikB> Just my $.02

    HenrikB> /Henrik


