[R] vectorization of rolling function

Jeff Newmiller jdnewmil at dcn.davis.ca.us
Mon Dec 8 09:25:46 CET 2014

```Please don't post in HTML... you may not recognize it, but the receiving
end does not necessarily (and in this case did not) look like the sending
end, and the cleanup can impede answers you are hoping to get.

In many cases, loops can be vectorized.  However, near as I can tell this
is an example of an algorithm that simply needs a loop [1].

One bit of advice: the coredata function is horribly slow. Just converting
your time series objects to numeric vectors for the purpose of this
computation sped up the algorithm by 500x on 10000 point series.
Converting it to inline C++ as below sped it up by yet another factor of
40x. 20000x is nothing to sneeze at.

#######
## optional temporary setup for windows
## assumes you have installed Rtools
gcc <- "C:\\Rtools\\bin"
rtools <- "C:\\Rtools\\gcc-4.6.3\\bin"
path <- strsplit(Sys.getenv("PATH"), ";")[[1]]
new_path <- c(rtools, gcc, path)
new_path <- new_path[!duplicated(tolower(new_path))]
Sys.setenv(PATH = paste(new_path, collapse = ";"))
## end of optional

library(Rcpp)

cppFunction(
"DataFrame EvapSimRcpp( NumericVector RR
, NumericVector ETmax
, const double Smax
, const double initialStorage ) {
int n = RR.size();

// create empty time-series to fill
// effective rainfall (i.e. rainfall minus intercepted rainfall)
NumericVector RReff( n );
// intercepted rainfall( n );
NumericVector Rint( n );
// residual potential evapotranspiration (ie ETmax minus
// evaporation from interception)
NumericVector ETres( n, NA_REAL );
double evap;

// volume of water in interception storage at start of
// computation
double storage = initialStorage;

for ( int i=0; i<n; i++ ) {
// compute interception capacity for time step i (maximum
// interception capacity minus any water intercepted but not
// evaporated during previous time-step).
Rint[ i ] = Smax - storage;
// compute intercepted rainfall: equal to rainfall if smaller
// than interception capacity, and to interception capacity if
// larger.
if ( RR[ i ] < Rint[ i ] ) Rint[ i ] = RR[ i ];
// compute effective rainfall (rainfall minus intercepted
// rainfall).
RReff[ i ] = RR[ i ] - Rint[ i ];
// update interception storage: initial interception storage +
// intercepted
// rainfall.
storage = storage + Rint[ i ];
// compute evaporation from interception storage: equal to
// potential evapotranspiration if the latter is smaller than
// interception storage, and to interception storage if larger.
if ( storage > ETmax[ i ] )
evap = ETmax[ i ];
else
evap = storage;
// compute residual potentiel evapotranspiration: potential
// evapotranspiration minus evaporation from interception
// storage.
ETres[ i ] = ETmax[ i ] - evap;
// update interception storage, to be carried over to next
// time-step: interception storage minus evaporation from
// interception storage.
storage = storage - evap;
}
DataFrame DF = DataFrame::create( Named( \"int\" ) = Rint
, Named( \"RReff\" ) = RReff
, Named( \"ETres\" ) = ETres
);
return DF;
}
")

EvapSimRcpp( RR, ETmax, Smax, 0 )

#######

[1] http://stackoverflow.com/questions/7153586/can-i-vectorize-a-calculation-which-depends-on-previous-elements

On Sat, 6 Dec 2014, A Duranel wrote:

> Hello
> I use R to run a simple model of rainfall interception by vegetation:
> rainfall falls on vegetation, some is retained by the vegetation (part of
> which can evaporate), the rest falls on the ground (quite crude but very
> similar to those used in SWAT or MikeSHE, for the hydrologists among you).
> It uses a loop on zoo time-series of rainfall and potential
> evapotranspiration. Unfortunately I did not find a way to vectorize it and
> it takes ages to run on long datasets. Could anybody help me to make it run
> faster?
>
> library(zoo)
> set.seed(1)
> # artificial potential evapotranspiration time-series
> ETmax<-zoo(runif(10, min=1, max=6), c(1:10))
> # artificial rainfall time-series
> RR<-zoo(runif(10, min=0, max=6), c(1:10))
>
> ## create empty time-series to fill
> # effective rainfall (i.e. rainfall minus intercepted rainfall)
> RReff<-zoo(NA, c(1:10))
> # intercepted rainfall
> int<-zoo(NA, c(1:10))
> # residual potential evapotranspiration (ie ETmax minus evaporation from
> interception)
> ETres<-zoo(NA, c(1:10))
>
> # define maximum interception storage capacity (maximum volume of rainfall
> that can be intercepted per time step, provided the interception store is
> empty at start of time-step)
> Smax<-3
> # volume of water in interception storage at start of computation
> storage<-0
>
> for (i in 1:length(ETmax)) {
>  # compute interception capacity for time step i (maximum interception
> capacity minus any water intercepted but not evaporated during previous
> time-step).
>  int[i]<-Smax-storage
>  # compute intercepted rainfall: equal to rainfall if smaller than
> interception capacity, and to interception capacity if larger.
>  if(RR[i]<int[i]) int[i]<-RR[i]
>  # compute effective rainfall (rainfall minus intercepted rainfall).
>  RReff[i]<-RR[i]-int[i]
>  # update interception storage: initial interception storage + intercepted
> rainfall.
>  storage<-storage+coredata(int[i])
>  # compute evaporation from interception storage: equal to potential
> evapotranspiration if the latter is smaller than interception storage, and
> to interception storage if larger.
>  if(storage>coredata(ETmax[i])) evap<-coredata(ETmax[i]) else evap<-storage
>  # compute residual potentiel evapotranspiration: potential
> evapotranspiration minus evaporation from interception storage.
>  ETres[i]<-ETmax[i]-evap
>  # update interception storage, to be carried over to next time-step:
> interception storage minus evaporation from interception storage.
>  storage<-storage-evap
> }
>
> Many thanks for your help!
>
> Arnaud
> UCL Department of Geography, UK
>
>
>
> --
> View this message in context: http://r.789695.n4.nabble.com/vectorization-of-rolling-function-tp4700487.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> and provide commented, minimal, self-contained, reproducible code.
>

---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<jdnewmil at dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...