# [Rd] dict package: dictionary data structure for R

Martin Maechler maechler at stat.math.ethz.ch
Tue Jul 24 19:32:47 CEST 2007

```
>>>>> "HenrikB" == Henrik Bengtsson <hb at stat.berkeley.edu>
>>>>>     on Tue, 24 Jul 2007 18:58:04 +0200 writes:

HenrikB> On 7/23/07, Seth Falcon <sfalcon at fhcrc.org> wrote:
>> Bill Dunlap <bill at insightful.com> writes:
>> > With environments, if you use a prime number for the size
>> > you get considerably better results.  E.g.,
>>
>> > Perhaps new.env() should push the requested size up
>> > to the next prime by default.
>>
>> Perhaps.  I think we should also investigate other hashing functions
>> since computing the next prime and doing so for resizes will take
>> longer than not having to do it and it will add complexity to the
>> code.

HenrikB> An alternative is to hard-wire primes within a reasonable range:

HenrikB> http://primes.utm.edu/lists/small/millions/
HenrikB> http://www.math.utah.edu/~pa/math/p10000.html

HenrikB> Maybe primes close to 2^n are good enough for this problem:

HenrikB> http://primes.utm.edu/lists/2small/

Yes, I had a similar thought....

Note that you don't need web sites for prime numbers:

My R factorization utilities, which I have mentioned a few times,
e.g., here
http://tolstoy.newcastle.edu.au/R/help/05/01/10007.html

can give the first few hundred thousand primes quickly enough:

> source("ftp://stat.ethz.ch/U/maechler/R/prime-numbers-fn.R")

> system.time(PS3 <- prime.sieve(prime.sieve(prime.sieve())))
user  system elapsed
0.446   0.006   0.452

[1]  2  3  5  7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71
> tail(PS3, 20)
[1] 273233 273253 273269 273271 273281 273283 273289 273311 273313 273323
[11] 273349 273359 273367 273433 273457 273473 273503 273517 273521 273527
>

There are more prime / factorization utilities in that simple R
source file, but as I say there, one should really use C code to
do this; but then R has become so fast ...

Martin Maechler, ETH Zurich

HenrikB> Just my $.02

HenrikB> /Henrik

```
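
The suggestion in the thread, pushing a requested environment size up to the next prime, can be sketched in a few lines of base R. This is only an illustration under stated assumptions: `primes_upto()` and `next_prime()` are hypothetical helper names, not functions from `prime-numbers-fn.R` and not part of R itself, and the sieve is a plain Sieve of Eratosthenes rather than the `prime.sieve()` used in the post.

```r
# Hypothetical helper: all primes <= n via a Sieve of Eratosthenes.
primes_upto <- function(n) {
  if (n < 2) return(integer(0))
  is_prime <- rep(TRUE, n)
  is_prime[1] <- FALSE
  i <- 2
  while (i * i <= n) {
    if (is_prime[i]) {
      # Cross out multiples of i, starting at i^2.
      is_prime[seq.int(i * i, n, by = i)] <- FALSE
    }
    i <- i + 1
  }
  which(is_prime)
}

# Hypothetical helper: smallest prime >= size, by trial division
# against the primes up to sqrt(candidate). The sieve is recomputed
# for every candidate, which is wasteful but keeps the sketch short.
next_prime <- function(size) {
  size <- max(2L, as.integer(ceiling(size)))
  repeat {
    small <- primes_upto(floor(sqrt(size)))
    if (all(size %% small != 0)) return(size)
    size <- size + 1L
  }
}

# Request a hashed environment whose initial size is rounded up to a prime.
e <- new.env(hash = TRUE, size = next_prime(1000))
```

Whether a prime size actually improves hashing depends on the hash function R uses internally, which is the open question in the thread; the sketch only shows that computing the next prime is cheap for sizes in this range.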