[R] Benefit of Iterators (package iterator)

Thierry Onkelinx thierry.onkelinx at inbo.be
Thu Dec 8 16:45:26 CET 2016


Dear Harold,

I get a different story:

library(doParallel)
library(microbenchmark)
cl <- makeCluster(4)
registerDoParallel(cl)
x <- matrix(rnorm(1000000), ncol=1000)
itx <- iter(x, by='row')
microbenchmark(
  iterator = foreach(i=itx, .combine=c) %dopar% mean(i),
  base = foreach(i= 1:nrow(x), .combine=c) %dopar% mean(x[i,])
)

Unit: milliseconds
     expr       min         lq       mean     median         uq      max neval cld
 iterator   2.11206   2.298507   6.254412   2.540116   2.691283  370.623   100  a
     base 390.21825 442.561737 550.169590 452.729684 466.343894 2554.329   100   b
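
A caveat worth checking when reading these timings: itx is created once, outside microbenchmark(), and iterators from the iterators package are stateful. If the first evaluation consumes it, later evaluations would loop over an exhausted iterator and finish almost immediately, which would be consistent with the gap between the minimum (about 2 ms) and the maximum (about 370 ms) in the iterator row. The sketch below is not part of the original benchmark. It uses a small matrix and the sequential %do% operator (both assumptions made for brevity) to show the effect, and one way to avoid it by building a fresh iterator inside the call.

```r
library(foreach)
library(iterators)

# Small illustrative matrix (assumption: sizes chosen for brevity).
x <- matrix(rnorm(20), ncol = 5)   # 4 rows, 5 columns

# Iterator created once, as in the benchmark above.
itx <- iter(x, by = "row")

# The first pass walks over all rows and uses up the iterator.
r_first <- foreach(i = itx, .combine = c) %do% mean(i)

# The second pass reuses the same iterator object, which has nothing left to yield.
r_second <- foreach(i = itx, .combine = c) %do% mean(i)

# Building the iterator inside the call gives a fresh one on every evaluation.
r_fresh <- foreach(i = iter(x, by = "row"), .combine = c) %do% mean(i)

length(r_first)               # number of rows processed on the first pass
length(r_second)              # number of rows processed on the second pass
all.equal(r_first, r_fresh)   # compares the first pass with the fresh-iterator pass
```

In a benchmark, the same idea would mean writing foreach(i = iter(x, by = "row"), .combine = c) %dopar% mean(i) inside microbenchmark(), so that every repetition iterates over the full matrix.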

Best regards,


ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and
Forest
team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
Kliniekstraat 25
1070 Anderlecht
Belgium

To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to say
what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of data.
~ John Tukey

2016-12-08 15:20 GMT+01:00 Doran, Harold <HDoran at air.org>:

> R-Help (and package author)
>
> I'm trying to understand within the context of R what the benefit of using
> an iterator is. My only goal in using the foreach package is to improve
> computational speed with some embarrassingly parallel tasks I have to
> compute.
>
> I took the example found at the link below to provide a reproducible
> example and ran it in a "conventional" way to iterate in a loop and the
> timing suggests here (as well as with my actual project) that using an
> iterator generates the same object, but at a much slower speed.
>
> If I can get the same thing faster without using an iterator what would be
> the potential of its use?
>
> https://msdn.microsoft.com/en-us/microsoft-r/foreach
>
> > library(doParallel)
> > cl <- makeCluster(8)
> > registerDoParallel(cl)
> > x <- matrix(rnorm(1000000), ncol=1000)
> > itx <- iter(x, by='row')
> > system.time(r1 <- foreach(i=itx, .combine=c) %dopar% mean(i))
>    user  system elapsed
>    0.40    0.08    0.87
> > system.time(r2 <- foreach(i= 1:nrow(x), .combine=c) %dopar% mean(x[i,]))
>    user  system elapsed
>    0.41    0.03    0.81
> > all.equal(r1,r2)
> [1] TRUE
