[R] On the speed of apply and alternatives?

Gabor Grothendieck ggrothendieck at gmail.com
Tue May 9 03:33:33 CEST 2006


Is testm really of class "matrix"?  If its a "data.frame" then manipulation
of matrices is often faster.

On 5/8/06, Monty B. <montezumasrevenge at gmail.com> wrote:
> Dear all,
>
> I have to handle a large matrix (1000 x 10001) where in the
> last column i have a value that all the preceding values in the same row
> has to be compared to.
>
> I have made the following code :
>
> # generate a (1000 x 10001) matrix, testm
> # generate statistics matrix 1000 x 4:
>
> qnt <- c(0.01, 0.05)
> cmp_fun  <- function(x)
> {
>  LAST <- length(x)
>  smpls <- x[1:(LAST-1)]
>  real  <- x[LAST]
>
>  ret <- vector(length=length(qnt)*2)
>  for (i in 1:length(qnt))
>  {
>    q_i  <- quantile(smpls, qnt[i])            # the quantile i
>    m_i <- mean(smpls[smpls<q_i ] )     # mean of obs less than q_i
>    ret[i] <- ifelse(real < q_i, 1, 0)
>    ret[length(qnt)+i] <- ifelse(real < q_i, real - m_i, 0)
>  }
>  ret
> }
> hcvx  <- apply(testm, 1, cmp_fun)
>
> The code is functioning well, but seems to take forever to calculate
> the statistics matrix. As I have to repeat this snippet 2000 times, I
> have a problem. Can anyone advise as to how I can optimize the runtime
> of this problem? Should i drop the apply function altogether and just
> loop through the rows with a for loop? Does anyone know of matrix
> functions I can use to do the same operations I use within the cmp_fun
> function to avoid this looping?
>
> All suggestions are welcome! I have little experience optimizing code
> in R, so I am quite stumped at the moment.
>
> Cheers,
>
> Monty




More information about the R-help mailing list