[R] conditional assignments and calculations

Michael Lachmann lachmann at eva.mpg.de
Sat Oct 2 20:52:32 CEST 2004


Hello!

I am using the TeXmacs interface to R (though I encountered a similar 
problem when using Sweave).
In doing calculations I often encounter this scenario: I'll have some 
calculations in my file:
--
A=read.lots.of.data()

B=huge.calculation.on(A)

C=another.calculation.on(B)
--
Now, if A has already been read, I don't need to re-read it. If B has 
already been calculated, I don't need to recalculate it. But I would 
like to be able to just press 'enter' on each of them.

So, I would like R to somehow figure out dependencies (a bit like in 
Makefiles)

I implemented something like this with the following functions:
----------------------
touch=function(x) {attr(x,"last.updated")=Sys.time();x}

last.updated=function(a) {
   if( length(attr(a,"last.updated")) == 0 ) {
       Sys.time()
   } else {
       attr(a,"last.updated")
   }
}


"depends<-"=function(a,value,...) {
   args=list(...)
   if( length(attr(a,"last.updated")) == 0 ) {
     a <- value
     a <-touch(a)
   } else {
     lu=(sapply(args,function(x) last.updated(x)-last.updated(a) > 0 ))
     if( sum(lu)>0 ) {
        a <- value
        a <-touch(a)
     }
   }
   a
}
------------------------
Then I can implement what I wanted above as follows:
--
if( !exists("A") ) { A=read.lots.of.data(); A=touch(A) }

depends(B,A)=huge.calculation.on(A)
# this means the assignment 'B=huge.calculation.on(A)' is
# done only if A has been updated more recently than B.

depends(C,B)=another.calculation.on(B)
# ditto for C: done only if B has been updated more recently than C.
--
And now I can carelessly press 'enter' on these expressions that might 
otherwise take hours to compute. Each variable has a timestamp of the 
last time it was updated, and I do each calculation conditional on 
whether certain variables have been changed more recently. I can also save 
A, B, C to a file, load them later, and the calculations will not be redone.
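
To make the timestamp bookkeeping concrete, here is a small self-contained
sketch. It repeats the three definitions above in condensed form and uses toy
stand-ins (1:5 and cumsum() are placeholders, not the real calculations). It
checks that B keeps its timestamp while A is unchanged, that B is refreshed
once A has been touched again, and that the timestamp survives a round trip
through a file. The use of saveRDS()/readRDS() with a temporary file is an
assumption made only for this illustration.

```r
touch <- function(x) { attr(x, "last.updated") <- Sys.time(); x }

# An object without a timestamp is treated as "updated just now".
last.updated <- function(a) {
  lu <- attr(a, "last.updated")
  if (is.null(lu)) Sys.time() else lu
}

# Same logic as the version above: assign only if 'a' is unstamped
# or at least one dependency is newer than 'a'.
"depends<-" <- function(a, value, ...) {
  deps <- list(...)
  newer <- vapply(deps, function(d) last.updated(d) > last.updated(a), logical(1))
  if (is.null(attr(a, "last.updated")) || any(newer)) a <- touch(value)
  a
}

A <- touch(1:5)                 # toy stand-in for read.lots.of.data()
B <- NULL                       # B has to exist before the first call
depends(B, A) <- cumsum(A)      # first call: B is assigned and stamped
first.stamp <- last.updated(B)

Sys.sleep(0.1)                  # guard against coarse clock resolution
depends(B, A) <- cumsum(A)      # A unchanged: B is left alone
kept <- identical(last.updated(B), first.stamp)

Sys.sleep(0.1)
A <- touch(A)                   # pretend A was re-read
depends(B, A) <- cumsum(A)      # A is now newer: B is reassigned
refreshed <- last.updated(B) > first.stamp

# The attribute is stored with the object, so it survives saving.
path <- tempfile(fileext = ".rds")
saveRDS(B, path)
B.loaded <- readRDS(path)
survived <- identical(last.updated(B.loaded), last.updated(B))
```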

But this solution is quite ugly, because of several problems:

1. To call 'depends(A,B)=f(B)' the first time, A has to already exist, 
otherwise I get an error (before I enter the "depends<-" function.)

2. I would also like to have a convenient way to do
"if( !exists("A") ) { A=read.lots.of.data(); A=touch(A) }"
maybe something like:
depends(A)<-read.lots.of.data()
But that doesn't work, because of 1.
or
A %set% read.lots.of.data()
But that doesn't work, because I haven't figured out a way for a 
function to change one of its arguments in the caller's workspace.
(Maybe I could do A=A %set% read.lots.of.data(), but that is really ugly...)

3. It would be nice to be able to do touch(A) instead of A=touch(A)

4. If I modify A without calling 'A=touch(A)', then B will not be 
updated next time I call 'depends(B,A)=huge.calculation.on(A)'. So it 
would be nice to have the variable's 'last updated' time updated 
automatically. (Though then it is a bit problematic to decide what the 
'last updated' time should be for variables loaded from a file...)


5. The whole thing is rather kludgy. But I haven't found a good way to 
implement it.
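
One possible direction for points 1 to 3, offered as a sketch rather than a
finished design: use an ordinary function instead of a replacement function.
Function arguments in R are evaluated lazily, so the expensive expression is
only computed when it is actually needed. By contrast, to my understanding
the right-hand side of a replacement call such as
'depends(B,A)=huge.calculation.on(A)' is evaluated before "depends<-" is
entered, so only the assignment is conditional there. In the sketch the
target is passed as an unquoted name and written back with assign(), so it
does not have to exist beforehand. The names compute.if.stale, touch.var,
stamp, slow.square, raw.data and squares are hypothetical and exist only for
this illustration.

```r
# Timestamp of an object, or NULL if it has never been stamped.
stamp <- function(x) attr(x, "last.updated")

# touch.var(A): stamp the variable A in the caller's environment in place.
touch.var <- function(x) {
  name <- deparse(substitute(x))
  env <- parent.frame()
  obj <- get(name, envir = env)
  attr(obj, "last.updated") <- Sys.time()
  assign(name, obj, envir = env)
  invisible(obj)
}

# compute.if.stale(B, expr, A, ...): evaluate expr and assign it to B only if
# B does not exist, has no timestamp, or is older than one of the dependencies.
compute.if.stale <- function(target, expr, ...) {
  name <- deparse(substitute(target))   # 'target' itself is never evaluated
  env <- parent.frame()
  deps <- list(...)
  stale <- TRUE
  if (exists(name, envir = env, inherits = FALSE)) {
    mine <- stamp(get(name, envir = env))
    if (!is.null(mine)) {
      # An unstamped dependency counts as changed.
      newer <- vapply(deps,
                      function(d) is.null(stamp(d)) || stamp(d) > mine,
                      logical(1))
      stale <- any(newer)
    }
  }
  if (stale) {
    value <- expr                       # the lazy argument is forced only here
    attr(value, "last.updated") <- Sys.time()
    assign(name, value, envir = env)
  }
  invisible(stale)
}

calls <- 0
slow.square <- function(x) { calls <<- calls + 1; x^2 }  # toy "huge" calculation

compute.if.stale(raw.data, 1:5)                       # created and stamped
Sys.sleep(0.1)                                        # guard against coarse clocks
compute.if.stale(squares, slow.square(raw.data), raw.data)  # computed
compute.if.stale(squares, slow.square(raw.data), raw.data)  # skipped
calls.before.touch <- calls

Sys.sleep(0.1)
touch.var(raw.data)                                   # no 'A=touch(A)' needed
compute.if.stale(squares, slow.square(raw.data), raw.data)  # recomputed
```

With no dependencies listed, compute.if.stale(A, read.lots.of.data()) would
read the data only when A is missing or unstamped, which covers point 2.
Point 4 (noticing modifications automatically) is not addressed by this sketch.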

Suggestions?

Thanks,

    Michael



