[R] Vectorize rearrangement within each column

Roberto Osorio roboso at gmail.com
Fri Jan 19 20:17:32 CET 2007


Thanks for the solutions. Here are some timing tests with ma and idx
both 100 X 100,000. The machine is a 2.16 GHz Intel MacBook Pro with
2 GB of memory.

ma <- matrix(rnorm(1e7), nr = 100)      # 100 X 100,000
idx <- matrix(round( runif(1e7, 1, 100) ), nr = 100)

# Original:

system.time( {
    mb <- ma;
    for (j in 1:1e5) mb[,j] <- ma[idx[,j],j]
} )
[1] 1.354 0.087 1.435 0.000 0.000
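
The five numbers printed by system.time() are, in order, user CPU time,
system CPU time, elapsed time, and the user and system times of any
child processes. As a minimal sketch (not part of the original tests),
current versions of R name these components, so individual values can
be extracted by name; the loop being timed here is only a placeholder.

```r
# Placeholder workload; any expression can be timed this way
tm <- system.time(for (i in 1:1e5) sqrt(i))

# Components of the result, in seconds
user_time    <- tm[["user.self"]]   # CPU time spent in the R process
elapsed_time <- tm[["elapsed"]]     # wall-clock time

print(names(tm))
print(c(user = user_time, elapsed = elapsed_time))
```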

# Prof. Venables' version:

system.time( mb[] <- as.vector(ma)[as.vector(idx +
       outer(rep(nrow(ma), nrow(ma)), 1:ncol(ma)-1, '*'))] )
[1] 0.885 0.857 2.262 0.000 0.000
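
The index arithmetic in this version may be easier to follow on a tiny
case. The following is only an illustrative sketch; the 3 X 2 matrices
m and ix and the names off, lin and res are made up for the example. R
stores a matrix column by column, so element [i, j] sits at position
i + (j - 1) * nrow in the underlying vector, and the outer() term
supplies the (j - 1) * nrow offset for each column.

```r
# Hypothetical small example (all names are illustrative only)
m  <- matrix(c(10, 20, 30, 40, 50, 60), nrow = 3)  # columns (10,20,30), (40,50,60)
ix <- matrix(c(3, 1, 2, 2, 2, 1), nrow = 3)        # row to pick, within each column

# Column offsets: 0 for column 1, nrow(m) for column 2, and so on
off <- outer(rep(nrow(m), nrow(m)), 1:ncol(m) - 1, "*")

# Column-major (linear) positions of the requested elements
lin <- ix + off

# Fill a copy of m with the rearranged values, keeping its dimensions
res <- m
res[] <- as.vector(m)[as.vector(lin)]
print(res)
```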

# Patrick Burns' version:

system.time( {
    mb <- ma[cbind(as.vector(idx), as.vector(col(idx)))];
    dim(mb) <- dim(ma)
} )
[1] 1.672 0.615 2.277 0.000 0.000
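
Before comparing timings it may be worth confirming that the
approaches agree. The following is a minimal sketch on a small random
case, assuming the intended operation is "rearrange each column of the
matrix by the matching column of the index matrix". The names m, ix,
ref, by_pairs and by_offset are made up for the example, and the offset
is written with col() as a shorthand rather than in the outer() form
posted in the thread.

```r
set.seed(1)                                  # arbitrary seed, for reproducibility
m  <- matrix(rnorm(20), nrow = 4)            # 4 X 5 test matrix
ix <- matrix(sample(nrow(m), 20, replace = TRUE), nrow = 4)

# Reference result: rearrange each column with an explicit loop
ref <- m
for (j in seq_len(ncol(m))) ref[, j] <- m[ix[, j], j]

# Matrix indexing: each row of the two-column matrix is a (row, column) pair
by_pairs <- m[cbind(as.vector(ix), as.vector(col(ix)))]
dim(by_pairs) <- dim(m)

# Linear indexing with a per-column offset of (j - 1) * nrow(m)
by_offset <- m
by_offset[] <- as.vector(m)[as.vector(ix + (col(ix) - 1) * nrow(m))]

# Stops with an error if any of the results differ
stopifnot(identical(ref, by_pairs), identical(ref, by_offset))
```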

# Gabor Grothendieck's version:

This version ran into memory-handling problems at the full size, so I
reduced the number of columns by one order of magnitude. It is still
very slow.

ma <- matrix(rnorm(1e6), nr = 100)           # 100 X 10,000
idx <- matrix(round( runif(1e6, 1, 100) ), nr = 100)
system.time( as.matrix(mapply("[", as.data.frame(ma), as.data.frame(idx))) )
[1] 2.060 0.133 2.768 0.000 0.000

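So, Prof. Venables' solution has the lowest user CPU time (0.885 s,
against 1.354 s for the original loop), although the loop has the
shortest elapsed time (1.435 s against 2.262 s). In view of only
moderate time savings, I will take his advice and keep the original
loop for code clarity.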

Roberto Osorio