[R] Efficient access to elements of a list of lists

Henrik Bengtsson hb at biostat.ucsf.edu
Sun Mar 11 19:34:42 CET 2012


On Sun, Mar 11, 2012 at 9:18 AM, Benilton Carvalho
<beniltoncarvalho at gmail.com> wrote:
> Hi,
>
> I have a long list of lists from which I want to efficiently extract
> and rbind elements. So I'm using the approach below:
>
>
> f <- function(i){
>    out <- replicate(5, list(matrix(rnorm(80), nc=20)))
>    names(out) <- letters[1:5]
>    out
> }
> set.seed(1)
> lst <- lapply(1:1.5e6, f)
> (t0 <- system.time(tmp <- do.call(rbind, lapply(lst, '[[', 'b'))))
>
>
> Is there anything better/faster than the do.call+rbind+lapply combo
> above?

The "[[" function involves method dispatching.  You can avoid that by
using .subset2().  That may save you some (micro?)seconds.
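
A minimal sketch of that substitution, run on a small stand-in for 'lst' (100 elements instead of 1.5e6; the smaller size is only an assumption to keep it quick). Since .subset2() bypasses method dispatch, it is equivalent to "[[" only for plain lists like these, not for classed objects with their own "[[" method.

```r
# Small stand-in for 'lst', built the same way as in the question
set.seed(1)
lst <- lapply(1:100, function(i) {
  out <- replicate(5, list(matrix(rnorm(80), ncol = 20)))
  names(out) <- letters[1:5]
  out
})

# Extraction via "[[" (dispatches on class) and via .subset2() (no dispatch)
bList1 <- lapply(lst, FUN = "[[", "b")
bList2 <- lapply(lst, FUN = .subset2, "b")

# Both approaches return the same list of matrices
stopifnot(identical(bList1, bList2))
```

Wrapping each lapply() call in system.time() on the full-size list would show whether the saving matters for your data.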

Now, if all extracted elements truly have the same dimensions:

> bList <- lapply(lst, FUN='[[', 'b')
> str(head(bList))
List of 6
 $ : num [1:4, 1:20] 0.936 -0.844 -0.221 -0.581 -2.513 ...
 $ : num [1:4, 1:20] -0.2618 0.0259 -1.3131 -0.0547 -0.3296 ...
 $ : num [1:4, 1:20] -1.589 0.844 -1.121 0.21 -0.846 ...
 $ : num [1:4, 1:20] -1.192 -1.268 1.688 -0.295 0.466 ...
 $ : num [1:4, 1:20] 2.504 -0.833 -1.751 1.117 -0.775 ...
 $ : num [1:4, 1:20] 0.119 -0.313 1.741 0.403 -0.261 ...

then you can avoid the rbind() by doing an unlist()/dim()/aperm(), e.g.

# Extract 'b' as a 4-by-20-by-1.5e6 array
dim <- dim(bList[[1]]);
n <- length(bList);
bArray <- unlist(bList, use.names=FALSE);
dimA <- c(dim, n);
dim(bArray) <- dimA;

# If you really need a matrix, then...

# Turning it into a (4*1.5e6)-by-20 matrix
dimM <- dim;
dimM[1] <- n*dimM[1];
bMatrix <- aperm(bArray, perm=c(1,3,2));
dim(bMatrix) <- dimM;
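
To convince yourself that the steps above give the same matrix as do.call(rbind, ...), here is a small self-contained check. It is only a sketch: the list size of 50 and the names 'd' and 'ref' are made up for illustration, and the stopifnot() on the dimensions is an added guard for the "same dimensions" assumption.

```r
# Small stand-in for 'bList': 50 matrices, each 4-by-20
set.seed(1)
n <- 50
bList <- lapply(seq_len(n), function(i) matrix(rnorm(80), ncol = 20))

# Guard: the unlist()/dim() trick requires identical dimensions throughout
d <- dim(bList[[1]])
stopifnot(all(vapply(bList, function(m) identical(dim(m), d), logical(1))))

# Reference result from the original approach
ref <- do.call(rbind, bList)

# unlist()/dim()/aperm() approach
bArray <- unlist(bList, use.names = FALSE)
dim(bArray) <- c(d, n)                        # 4-by-20-by-n
bMatrix <- aperm(bArray, perm = c(1, 3, 2))   # 4-by-n-by-20
dim(bMatrix) <- c(n * d[1], d[2])             # (4*n)-by-20

# Same values in the same row order as rbind()
stopifnot(identical(bMatrix, ref))
```

The aperm() is what moves the list index next to the row index, so that collapsing the first two dimensions stacks the matrices in list order, just as rbind() does.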

You owe me a beer ;)

/Henrik

> On this example, the combo takes roughly 20s on my machine...
> but on the data I'm working with, it takes more than 1 minute... And
> given that I need to repeat the task several times, the cumul. amount
> of time is significant for me.
>
> Thank you for any suggestion/comment,
>
> benilton
>


