[Rd] Speed of for loops

Byron Ellis byron.ellis at gmail.com
Wed Jan 31 00:23:56 CET 2007


Actually, why not use a closure to store previous value(s)?

In the simple case, which depends on x_i and y_{i-1}

gen.iter = function(x) {
    y = NA
    function(i) {
       y <<- if(is.na(y)) x[i] else y+x[i]
    }
}

y = sapply(1:10,gen.iter(x))

Obviously you can modify the function for the bookkeeping required to
manage whatever lag you need. I use this sometimes when I'm
implementing MCMC samplers of various kinds.


On 1/30/07, Herve Pages <hpages at fhcrc.org> wrote:
> Tom McCallum wrote:
> > Hi Everyone,
> >
> > I have a question about for loops.  If you have something like:
> >
> > f <- function(x) {
> >       y <- rep(NA,10);
> >       for( i in 1:10 ) {
> >               if ( i > 3 ) {
> >                       if ( is.na(y[i-3]) == FALSE ) {
> >                               # some calculation F which depends on one or more of the previously
> > generated values in the series
> >                               y[i] = y[i-1]+x[i];
> >                       } else {
> >                               y[i] <- x[i];
> >                       }
> >               }
> >       }
> >       y
> > }
> >
> > e.g.
> >
> >> f(c(1,2,3,4,5,6,7,8,9,10,11,12));
> >   [1] NA NA NA  4  5  6 13 21 30 40
> >
> > is there a faster way to process this than with a 'for' loop?  I have
> > looked at lapply as well but I have read that lapply is no faster than a
> > for loop and for my particular application it is easier to use a for loop.
> > Also I have seen 'rle' which I think may help me but am not sure as I have
> > only just come across it, any ideas?
>
> Hi Tom,
>
> In the general case, you need a loop in order to propagate calculations
> and their results across a vector.
>
> In _your_ particular case however, it seems that all you are doing is a
> cumulative sum on x (at least this is what's happening for i >= 6).
> So you could do:
>
> f2 <- function(x)
> {
>     offset <- 3
>     start_propagate_at <- 6
>     y_length <- 10
>     init_range <- (offset+1):start_propagate_at
>     y <- rep(NA, offset)
>     y[init_range] <- x[init_range]
>     y[start_propagate_at:y_length] <- cumsum(x[start_propagate_at:y_length])
>     y
> }
>
> and it will return the same thing as your function 'f' (at least when 'x' doesn't
> contain NAs) but it's not faster :-/
>
> IMO, using sapply for propagating calculations across a vector is not appropriate
> because:
>
>   (1) It requires special care. For example, this:
>
>         > x <- 1:10
>         > sapply(2:length(x), function(i) {x[i] <- x[i-1]+x[i]})
>
>       doesn't work because the 'x' symbol on the left side of the <- in the
>       anonymous function doesn't refer to the 'x' symbol defined in the global
>       environment. So you need to use tricks like this:
>
>         > sapply(2:length(x),
>                  function(i) {x[i] <- x[i-1]+x[i]; assign("x", x, envir=.GlobalEnv); x[i]})
>
>   (2) Because of this kind of tricks, then it is _very_ slow (about 10 times
>       slower or more than a 'for' loop).
>
> Cheers,
> H.
>
>
> >
> > Many thanks
> >
> > Tom
> >
> >
> >
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>


-- 
Byron Ellis (byron.ellis at gmail.com)
"Oook" -- The Librarian



More information about the R-devel mailing list