[R] project parallel help

Jeffrey Flint jeffrey.flint at gmail.com
Mon Oct 14 22:25:27 CEST 2013


I'm running package parallel in R-3.0.2.

Below are the execution times, measured with system.time(), when
executing serially versus in parallel (with 2 cores) using parRapply.


Serially:
   user  system elapsed
   4.67    0.03    4.71



Using package parallel:
   user  system elapsed
   3.82    0.12    6.50



There is an evident improvement in user CPU time, but a big jump in
elapsed time.

In my code, I am executing a function on a 1000-row matrix 100 times,
with different data each time of course.
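In outline, my code looks like the following (myfun and the matrix
contents here are simplified stand-ins for the real computation):

```r
library(parallel)

cl <- makeCluster(2)  # one-time setup, about 1.25 s elapsed

# Stand-in for the real per-row function:
myfun <- function(row) sum(row^2)

for (i in 1:100) {
  m <- matrix(rnorm(1000 * 10), nrow = 1000)  # different data each iteration
  res <- parRapply(cl, m, myfun)              # parallel row-wise apply
}

stopCluster(cl)
```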

The initial call to makeCluster cost 1.25 seconds in elapsed time.
I'm not concerned about the makeCluster time since that is a fixed
cost.  I am concerned about the additional 1.43 seconds of elapsed
time beyond the user CPU time (6.50 = 3.82 + 1.25 + 1.43).

I am wondering if there is a way to structure the code to largely
avoid the 1.43-second overhead.  For instance, perhaps I could upload
the function to both cores manually, to avoid it being re-uploaded on
each of the 100 iterations?  Also, I am wondering if there is a way
to avoid any copying that is occurring on each of the 100 iterations?
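What I have in mind is something like the sketch below, using
clusterExport to send the function to the workers once up front;
whether parRapply then actually avoids re-serializing the function on
every call is exactly what I am unsure about:

```r
library(parallel)

cl <- makeCluster(2)

myfun <- function(row) sum(row^2)  # stand-in for the real function

# Send the function to both workers once, rather than on every iteration:
clusterExport(cl, "myfun")

for (i in 1:100) {
  m <- matrix(rnorm(1000 * 10), nrow = 1000)
  # Each call still ships the 1000-row matrix to the workers;
  # is there a way to avoid some of that copying as well?
  res <- parRapply(cl, m, myfun)
}

stopCluster(cl)
```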


Thank you.

Jeff Flint



More information about the R-help mailing list