[R] fast rowCumsums wanted for calculating the cdf

William Dunlap wdunlap at tibco.com
Fri Oct 15 18:59:00 CEST 2010


> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Joshua Wiley
> Sent: Friday, October 15, 2010 12:23 AM
> To: Gregor
> Cc: r-help at r-project.org
> Subject: Re: [R] fast rowCumsums wanted for calculating the cdf
> 
> Hi,
> 
> You might look at Reduce().  It seems faster.  I converted the matrix
> to a list in an incredibly sloppy way (which you should not emulate)
> because I cannot think of the simple way.
> 
> > probs <- t(matrix(rep(1:10000000), nrow=10)) # matrix with row-wise probabilities
> > F <- matrix(0, nrow=nrow(probs), ncol=ncol(probs));
> > F[,1] <- probs[,1,drop=TRUE];
> > add <- function(x) {Reduce(`+`, x, accumulate = TRUE)}
> >
> >
> > system.time(F.slow <- t(apply(probs, 1, cumsum)))
>    user  system elapsed
>  36.758   0.416  42.464
> >
> > system.time(for (cc in 2:ncol(F)) {
> +  F[,cc] <- F[,cc-1,drop=TRUE] + probs[,cc,drop=TRUE];
> + })
>    user  system elapsed
>   0.980   0.196   1.328
> >
> > system.time(add(list(probs[,1], probs[,2], probs[,3], probs[,4], probs[,5],
> +   probs[,6], probs[,7], probs[,8], probs[,9], probs[,10])))
>    user  system elapsed
>   0.420   0.072   0.539

One way to avoid typing out
   list(probs[,1], probs[,2], ...)
is to use split(probs, col(probs)).  However, split() is
surprisingly slow in this case.
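
For a small illustration of what split() returns here (a sketch
with made-up values, not the probs above):

   m <- matrix(1:6, nrow = 2)  # a 2 x 3 matrix
   split(m, col(m))            # a list of its 3 columns: 1:2, 3:4, 5:6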

Another approach is to use a matrix multiply, by an ncol(probs)
by ncol(probs) matrix with 0's in the lower triangle and 1's
elsewhere.  Here are four approaches I've seen mentioned,
made into functions that output the requested matrix:

f1 <- function(x) t(apply(x, 1, cumsum))   # cumsum each row, transpose back
f2 <- function(x) {                        # running total, column by column
    if (ncol(x) > 1)
        for (i in 2:ncol(x)) x[, i] <- x[, i] + x[, i - 1]
    x
}
f3 <- function(x)                          # accumulate the columns with Reduce()
    matrix(unlist(Reduce(`+`, split(x, col(x)), accumulate = TRUE)),
        ncol = ncol(x))
f4 <- function(x)                          # multiply by the 0/1 upper triangle
    x %*% outer(seq_len(ncol(x)), seq_len(ncol(x)), "<=")
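
For illustration, here is the 0/1 matrix f4 multiplies by,
shown for a 4-column input ("+ 0" just coerces logical to numeric):

  > outer(seq_len(4), seq_len(4), "<=") + 0
       [,1] [,2] [,3] [,4]
  [1,]    1    1    1    1
  [2,]    0    1    1    1
  [3,]    0    0    1    1
  [4,]    0    0    0    1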

I tested them as follows:
  > probs <- matrix(runif(1e7), ncol=10, nrow=1e6)
  > system.time(v1 <- f1(probs))
     user  system elapsed 
    19.45    0.25   16.78 
  > system.time(v2 <- f2(probs))
     user  system elapsed 
     0.68    0.03    0.79 
  > system.time(v3 <- f3(probs))
     user  system elapsed 
    38.25    0.24   34.47 
  > system.time(v4 <- f4(probs))
     user  system elapsed 
     0.70    0.05    0.56 
  > identical(v1,v2) && identical(v1,v3) && identical(v1,v4)
  [1] TRUE

If you have a fancy optimized/multithreaded BLAS linked
to your version of R, you may find that %*% is much
faster still (I'm using the precompiled R for Windows).
As ncol(x) gets bigger the %*% approach probably loses
its advantage: the matrix multiply does on the order of
ncol(x)^2 multiply-adds per row, while the column loop
in f2 does only ncol(x) additions.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com 

> >
> 
> 0.539 seconds for 10 vectors, each with 1 million elements, does not
> seem bad.  For 30,000 each, on my system it took 0.014 seconds, which,
> for 200,000 iterations, I estimated as:
> 
> > (200000*.014)/60/60
> [1] 0.7777778
> 
> (and this is on a stone-age system with a single-core processor and
> only 756MB of RAM)
> 
> It should not be difficult to get the list back to a matrix.
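> For instance (a sketch reusing the add() and probs defined above),
> do.call() can bind the accumulated columns back into a matrix:
>
> F.fast <- do.call(cbind, add(list(probs[,1], probs[,2], probs[,3],
>                               probs[,4], probs[,5], probs[,6], probs[,7],
>                               probs[,8], probs[,9], probs[,10])))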
> 
> Cheers,
> 
> Josh
> 
> 
> 
> On Thu, Oct 14, 2010 at 11:51 PM, Gregor <mailinglist at gmx.at> wrote:
> > Dear all,
> >
> > Maybe the "easiest" solution: Is there anything that speaks 
> against generalizing
> > cumsum from base to cope with matrices (as is done in matlab)? E.g.:
> >
> > "cumsum(Matrix, 1)"
> > equivalent to
> > "apply(Matrix, 1, cumsum)"
> >
> > The main advantage could be optimized code if the Matrix is extremely
> > nonsquare (e.g. 100,000 x 10) but the summation is done over the short
> > side (in this case 10).  apply() would practically yield a loop over
> > 100,000 elements, whereas vectorization w.r.t. the long side (a loop
> > over 10 elements) provides considerable efficiency gains.
> >
> > Many regards,
> > Gregor
> >
> >
> >
> >
> > On Tue, 12 Oct 2010 10:24:53 +0200
> > Gregor <mailinglist at gmx.at> wrote:
> >
> >> Dear all,
> >>
> >> I am struggling with a (currently) cost-intensive problem:
> >> calculating the (non-normalized) cumulative distribution function,
> >> given the (non-normalized) probabilities.  Something like:
> >>
> >> probs <- t(matrix(rep(1:100),nrow=10)) # matrix with row-wise probabilities
> >> F <- t(apply(probs, 1, cumsum)) #SLOOOW!
> >>
> >> One (already faster, but for sure not ideal) solution - thanks to
> >> Henrik Bengtsson:
> >>
> >> F <- matrix(0, nrow=nrow(probs), ncol=ncol(probs));
> >> F[,1] <- probs[,1,drop=TRUE];
> >> for (cc in 2:ncol(F)) {
> >>   F[,cc] <- F[,cc-1,drop=TRUE] + probs[,cc,drop=TRUE];
> >> }
> >>
> >> In my case, probs is a (30,000 x 10) matrix, and I need to iterate
> >> this step around 200,000 times, so speed is crucial.  I currently can
> >> make sure to have no NAs, but in order to extend matrixStats, this
> >> could be a nontrivial issue.
> >>
> >> Any ideas for speeding up this - probably routine - task?
> >>
> >> Thanks in advance,
> >> Gregor
> >>
> >
> 
> 
> 
> -- 
> Joshua Wiley
> Ph.D. Student, Health Psychology
> University of California, Los Angeles
> http://www.joshuawiley.com/
> 
> 


