[R] fast rowCumsums wanted for calculating the cdf

Gregor mailinglist at gmx.at
Tue Oct 12 10:24:53 CEST 2010


Dear all,

I am struggling with a (currently) expensive computation: calculating the
(non-normalized) cumulative distribution function from given (non-normalized)
probabilities. Something like:

probs <- t(matrix(1:100, nrow=10)) # matrix with row-wise probabilities
F <- t(apply(probs, 1, cumsum)) # SLOOOW!

One (already faster, but surely not ideal) solution, thanks to Henrik Bengtsson:

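# Accumulate column by column, so each step is vectorized over all rows.
F <- matrix(0, nrow=nrow(probs), ncol=ncol(probs))
F[,1] <- probs[,1]
# seq_len(...)[-1] is empty for a one-column matrix, unlike 2:ncol(F)
for (cc in seq_len(ncol(F))[-1]) {
  F[,cc] <- F[,cc-1] + probs[,cc]
}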
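
To check that the two approaches agree and to compare their cost, a minimal sketch along these lines may help. The function names row_cumsum_apply and row_cumsum_loop are hypothetical wrappers introduced only for this illustration, and the random test matrix merely mimics the 30,000 x 10 shape; actual timings will depend on the machine and R version.

```r
# Hypothetical wrappers around the two approaches (names are illustrative only).
row_cumsum_apply <- function(p) t(apply(p, 1, cumsum))

row_cumsum_loop <- function(p) {
  out <- matrix(0, nrow = nrow(p), ncol = ncol(p))
  out[, 1] <- p[, 1]
  # Column-wise accumulation; each step is vectorized over all rows.
  for (cc in seq_len(ncol(p))[-1]) {
    out[, cc] <- out[, cc - 1] + p[, cc]
  }
  out
}

# Assumed test data: random values in the 30,000 x 10 shape.
set.seed(1)
p <- matrix(runif(30000 * 10), nrow = 30000)

# Both variants should agree up to floating-point tolerance.
stopifnot(isTRUE(all.equal(row_cumsum_apply(p), row_cumsum_loop(p))))

# Rough timing over a few repetitions.
system.time(for (i in 1:10) row_cumsum_apply(p))
system.time(for (i in 1:10) row_cumsum_loop(p))
```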

In my case, probs is a 30,000 x 10 matrix, and I need to repeat this step around
200,000 times, so speed is crucial. I can currently guarantee that there are no NAs,
but if this were to be added to matrixStats, NA handling could be a nontrivial issue.
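
Since the matrix has only 10 columns, one alternative that might be worth timing is a single matrix product with an upper-triangular matrix of ones. This is only a sketch under the no-NA assumption: row_cumsum_matmul is a hypothetical name, and whether it is actually faster here is untested.

```r
# Hypothetical helper: row-wise cumulative sums via one matrix product.
row_cumsum_matmul <- function(p) {
  n <- ncol(p)
  # upper[i, j] is 1 when i <= j, so column j of the product
  # adds up columns 1..j of p.
  upper <- matrix(0, nrow = n, ncol = n)
  upper[upper.tri(upper, diag = TRUE)] <- 1
  p %*% upper
}

# Check against the apply()-based reference on the small example.
probs <- t(matrix(1:100, nrow = 10))
F_ref <- t(apply(probs, 1, cumsum))
stopifnot(isTRUE(all.equal(row_cumsum_matmul(probs), F_ref)))
```

Note that an NA in a row would be expected to contaminate that entire row of the product (because NA * 0 is NA), whereas cumsum only propagates an NA from its position onward, so this variant would not be a drop-in replacement when NAs can occur.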

Any ideas for speeding up this - probably routine - task?

Thanks in advance,
Gregor


