[R] Please explain "do.call" in this context, or critique to "stack this list faster"

baptiste auguie baptiste.auguie at googlemail.com
Sun Sep 5 14:03:01 CEST 2010


Another approach I like is reshape::melt.list(), because it keeps track
of the names of the original data.frames:

library(plyr)     # provides rbind.fill()
library(reshape)  # provides melt() for lists
l = replicate(1e4, data.frame(x=rnorm(100),y=rnorm(100)), simplify=FALSE)
system.time(a <- rbind.fill(l))
#   user  system elapsed
# 2.482   0.111   2.597
system.time(b <- melt(l,id=1:2))
#   user  system elapsed
#  6.556   0.229   6.801
system.time(c <- do.call(rbind, l))
#  user  system elapsed
# 55.020  71.356 129.300

all.equal(a, b[ , -3])
#[1] TRUE
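
The third column dropped in the comparison above is the identifier that
melt() adds to record which list element each row came from (as far as I
know it is named L1). For anyone without reshape installed, the same
bookkeeping can be sketched in base R. This is only an illustration of
the idea, not how melt.list() is implemented; the names dfs, tag and
stacked are made up for the example.

```r
# Hypothetical small named list standing in for `l` above
dfs <- list(first  = data.frame(x = 1:2, y = c(0.5, 1.5)),
            second = data.frame(x = 3:5, y = c(2.5, 3.5, 4.5)))

# Add a column holding the name of the list element each row came from
tag <- function(d, nm) {
  d$L1 <- rep(nm, nrow(d))
  d
}
tagged <- Map(tag, dfs, names(dfs))

# Stack the tagged pieces; fine for a handful of data.frames,
# but slow for thousands of them, as the timings above show
stacked <- do.call(rbind, tagged)
rownames(stacked) <- NULL

table(stacked$L1)
```

With an unnamed list such as l there are no names to record, so an
integer index (seq_along(dfs)) would have to play the same role.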

baptiste

On 5 September 2010 04:48, Hadley Wickham <hadley at rice.edu> wrote:
>> One common way around this is to pre-allocate memory and then to
>> populate the object using a loop, but a somewhat easier solution here
>> turns out to be ldply() in the plyr package. The following is the same
>> idea as do.call(rbind, l), only faster:
>>
>>> system.time(u3 <- ldply(l, rbind))
>>   user  system elapsed
>>   6.07    0.01    6.09
>
> I think all you want here is rbind.fill:
>
>> system.time(a <- rbind.fill(l))
>   user  system elapsed
>  1.426   0.044   1.471
>
>> system.time(b <- do.call("rbind", l))
>   user  system elapsed
>     98      60     162
>
>> all.equal(a, b)
> [1] TRUE
>
> This is considerably faster than do.call + rbind because I spent a lot
> of time working out how to do this most efficiently. You can see the
> underlying code at http://github.com/hadley/plyr/blob/master/R/rbind.r
> - it's relatively straightforward except for ensuring the output
> columns are the same type as the input columns.  This is a good
> example where optimised R code is much faster than C code.
>
> Hadley
>
> --
> Assistant Professor / Dobelman Family Junior Chair
> Department of Statistics / Rice University
> http://had.co.nz/
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
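
The pre-allocation approach mentioned at the top of the quoted message
can be sketched roughly as follows. It assumes every data.frame has the
same two numeric columns x and y, as in the example above; pieces and
stacked2 are made-up names, and this is not the code used by
rbind.fill() or ldply().

```r
# Small stand-in for the list `l` used in the timings
set.seed(1)
pieces <- replicate(50, data.frame(x = rnorm(4), y = rnorm(4)),
                    simplify = FALSE)

# Work out where each piece will land in the final result
n_rows <- vapply(pieces, nrow, integer(1))
starts <- cumsum(n_rows) - n_rows + 1L

# Pre-allocate one full-length vector per column
x <- numeric(sum(n_rows))
y <- numeric(sum(n_rows))

# Fill each slice in place instead of growing the result step by step
for (i in seq_along(pieces)) {
  idx <- seq.int(starts[i], length.out = n_rows[i])
  x[idx] <- pieces[[i]]$x
  y[idx] <- pieces[[i]]$y
}

# Build the data.frame once, at the end
stacked2 <- data.frame(x = x, y = y)
```

Filling plain vectors and calling data.frame() once avoids the repeated
copying that comes from rbind-ing one data.frame at a time.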



-- 
____________________

Dr. Baptiste Auguié

Departamento de Química Física,
Universidade de Vigo,
Campus Universitario, 36310, Vigo, Spain

tel: +34 9868 18617
http://webs.uvigo.es/coloides


