[R] Dynamic Dictionary Data Type?

Kjetil Brinchmann Halvorsen kjetil at acelerate.com
Thu Jun 2 20:14:59 CEST 2005


Prof Brian Ripley wrote:

> On Thu, 2 Jun 2005, hadley wickham wrote:
>
>>> An environment is a hash table, and copying occurs only on 
>>> modification,
>>> when any language would have to copy in this context.
>>
>
> Caveat: the default is new.env(hash=FALSE), so an environment is a 
> hash table in one sense but not necessarily so in the strict sense.
>
>> Yes, I'm aware that copying only occurs on modification.  However, it
>> was my understanding that
>>
>> a <- list(a =1)
>> a$b <- 4
>>
>> would create a new copy of a,
>
>
> Thomas was talking about environments, not lists (and $ works 
> differently for them).
>
>> whereas in Java say
>>
>> HashMap a = new HashMap();
>> a.put("a", 1);
>> a.put("b", 2);
>>
>> wouldn't create a copy of a. (please bear in mind my java syntax is 
>> very rusty!)
>>
>> Caching data implies updating at each step, thus possibly creating n
>> copies of the list.  Is that wrong?
>
>
> It depends what you mean by `copy'.  If you expand a hash table you at 
> some point need to re-arrange the hashing of the entries.  That's what 
> will happen with an R environment or an R list too.  The possible 
> difference is that you might expect the table to have some room for 
> expansion, and in your list example you did not give any.
>
> R's operations on lists make more copies than are strictly necessary, 
> but it is not clear that this is more costly than the housekeeping 
> that would otherwise be necessary.  In a$b <- 4, the wrapper VECSXP is 
> recreated and the pointers copied across, but the list elements are 
> not copied.  For
> a$a <- 4 it is probable that no copying is done (although if a2 <- a 
> had been done previously, the pending recursive copy would then be done).
> (It is also possible that I have overlooked something in the rather 
> complex code used to do these subassignments.)
>
The original poster asked for caching of results. here is an example 
using new.env():

memo <- function(fun) {
   mem <- new.env() 
   function(x) {
       if (exists(as.character(x), envir=mem)) get(as.character(x), 
envir=mem, inherits=FALSE) else {
       val <- fun(x)
       assign(as.character(x), value=val, envir=mem)
       val }
   }
}

 > fib <- memo(function(n) if(n<=1) 1 else fib(n-1)+fib(n-2))
 > system.time( fib(300) )
[1] 0.01 0.00 0.02   NA   NA

ls(get("mem", env=environment(fib)))
   *output supressed*

To compare:
system.time( {fib2 <- function(n)if(n<=1)1 else 
fib2(n-1)+fib2(n-2);fib2(30)})
[1]  8.07  0.08 12.75    NA    NA


(there is (at least) one problem with this solution: if the global 
workspace contains a
 variable `6`, it gives error. Why?)

Kjetil

-- 

Kjetil Halvorsen.

Peace is the most effective weapon of mass construction.
               --  Mahdi Elmandjra




-- 
No virus found in this outgoing message.
Checked by AVG Anti-Virus.




More information about the R-help mailing list