[R] foreach takes forever?

Steve Lianoglou mailinglist.honeypot at gmail.com
Mon Jan 21 17:27:29 CET 2013


Hi,

On Mon, Jan 21, 2013 at 10:59 AM, Andre Zege <azege at yahoo.com> wrote:
> I started to look at ways to improve the run times of certain highly parallel tasks and thought that foreach should be a valid candidate for the job.
> So I opened the foreach tutorial by Steve Weston and started timing the examples from it. The first example from the tutorial is
>
>
>> system.time(for(i in 1:100000) sqrt(i))
>
>    user  system elapsed
>    0.06    0.00    0.06
>> system.time(foreach(i=1:100000) %do% sqrt(i))
>    user  system elapsed
>  102.37    0.21  103.38
>
> Hmm, 1700 times slower?
>
> The second example is
>> system.time(x <- exp(1:1000000))
>    user  system elapsed
>    0.34    0.03    0.42
>> system.time(x <- foreach(i=1:1000000, .combine='c') %do% exp(i))
>
>
> I stopped it at 958 seconds, as I didn't have enough patience. It basically seems that foreach slows this naive example down by more than 2000 times. I must be doing something very wrong. Am I supposed to set some environment variables before it works properly? I am running 64-bit R on a Win7 laptop with a dual-core 2.27 GHz CPU and 4 GB of memory.

You should keep reading that vignette you are working from :-)

From Section 5 "Parallel Execution":

"""
... But for the kinds of quick running operations that we’ve been
doing, there wouldn’t be much point to executing them in parallel.
Running many tiny tasks in parallel will usually take more time to
execute than running them sequentially, and if it already runs fast,
there’s no motivation to make it run faster anyway. But if the
operation that we’re executing in parallel takes a minute or longer,
there starts to be some motivation.
"""

The task you are parallelizing is too trivial. The time foreach spends
coordinating each iteration (plus the data splitting, forking, etc.
once you actually run in parallel) is more than the time spent just
running sqrt.
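
If you want to see that overhead for yourself, here is a rough sketch
(not from the vignette) that times both loops on a much smaller input
and estimates the extra cost per iteration. The names n, t_for,
t_foreach and per_iter are just illustrative, and the numbers will
vary from machine to machine:

```r
library(foreach)

n <- 1000  # small on purpose, so the foreach version finishes quickly

# Elapsed time for a plain for loop and for the sequential foreach loop.
t_for     <- system.time(for (i in seq_len(n)) sqrt(i))[["elapsed"]]
t_foreach <- system.time(foreach(i = seq_len(n)) %do% sqrt(i))[["elapsed"]]

# Rough extra cost per iteration that foreach adds over the plain loop.
per_iter <- (t_foreach - t_for) / n
print(c(for_loop = t_for, foreach = t_foreach, per_iteration = per_iter))
```

Note that %do% runs everything sequentially in the current session, so
whatever difference you see here is foreach's own bookkeeping and not
any parallel communication cost.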

When the specific task you are running within each iteration is more
involved, the benefit of parallelization will become more clear.
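
One way to make each iteration "more involved" for your toy example is
to hand foreach a few large chunks instead of 100000 single numbers.
This is only a sketch under my own assumptions: n_chunks is an
arbitrary choice (say, one chunk per core), and I use %do% so that it
runs without a registered backend. With a backend registered (e.g. via
doParallel), you would swap in %dopar%:

```r
library(foreach)

n        <- 100000
n_chunks <- 4  # hypothetical choice, e.g. one chunk per core

# Split the indices 1..n into n_chunks contiguous groups.
chunks <- split(seq_len(n), ceiling(seq_len(n) * n_chunks / n))

# Each foreach task now handles a whole chunk with vectorized sqrt(),
# so the per-task overhead is paid n_chunks times rather than n times.
res <- foreach(idx = chunks, .combine = "c") %do% sqrt(idx)

# Same values, in the same order, as the plain vectorized call.
stopifnot(isTRUE(all.equal(res, sqrt(seq_len(n)))))
```

For something as cheap as sqrt the plain vectorized call will still
win, but the same chunking pattern is what lets a genuinely expensive
per-element computation benefit from running in parallel.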

-steve

-- 
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact


