[Rd] Erlang-style message-passing in R: Rmpi, Snow, NetWorkSpaces, etc.

David Bauer astgtciv2008 at gatech.edu
Fri Sep 5 04:57:36 CEST 2008


> What would you say typically limits taskPR's approach, not finding
> enough instruction-level parallelism at the R script level, or the
> communications overhead (probably latency) of trying to make use of
> it?

Depends on the specific function.  The communication cost is 
significant, especially serialization and deserialization.  (Since I 
finally found the right way to force a flush of the TCP data, the actual 
network cost isn't a problem for moderately sized data.)  To keep the 
implementation simple and easy to get right, a lot of the R 
environment is serialized and sent over with *each* operation.
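
To make the per-operation cost concrete, here is a minimal sketch in 
Python (not R, and not taskPR's actual code) that compares shipping an 
operation together with its whole environment against shipping only a 
handle to an environment the worker already holds.  Python's pickle 
stands in for R's serialization, and the function names and the 
contents of the environment are hypothetical.

```python
import pickle


def ship_operation(environment, op_name, args):
    """Serialize an operation together with its entire environment."""
    return pickle.dumps({"env": environment, "op": op_name, "args": args})


def ship_operation_by_reference(env_id, op_name, args):
    """Serialize only a handle to an environment the worker already has."""
    return pickle.dumps({"env_id": env_id, "op": op_name, "args": args})


# Hypothetical environment: several large vectors the operation never uses.
environment = {f"x{i}": list(range(10_000)) for i in range(20)}

full = ship_operation(environment, "sum", [1, 2, 3])
small = ship_operation_by_reference(0, "sum", [1, 2, 3])

# The payload carrying the environment is far larger, and in the scheme
# described above that cost is paid again for every operation.
print(len(full), len(small))

# The receiving side has to pay the matching deserialization cost.
roundtrip = pickle.loads(full)
```

Sending a handle is simpler to describe than to implement: the worker's 
copy of the environment has to be kept consistent with the master's, 
which is the correctness burden that re-sending everything avoids.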

In terms of the instruction-level parallelism available, code that is a 
performance bottleneck is usually rewritten in C or Fortran and called 
in large blocks.  So now the program is trying to find parallelism 
between those large blocks, which it usually can't.

I didn't have a lot of suitable code to try, and so the best example 
program was one that did a complex calculation followed by an accumulate 
operation in a loop.  Parallel-R/taskPR dynamically unrolled the loop 
(just like Tomasulo's algorithm does on a processor) and got a 
reasonable speedup (about half of linear).  Unfortunately, I don't even 
have that code example any more.
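
Since the original example is lost, the following is only a sketch of 
the shape of such a loop, written in Python with futures; it is not a 
reconstruction of the R program or of how taskPR schedules work.  The 
function names and the stand-in calculation are hypothetical.  Each 
iteration's calculation is independent, so all of them can be issued 
before any result is needed, while the accumulate step depends on the 
previous total and is applied in program order.

```python
from concurrent.futures import ThreadPoolExecutor


def complex_calculation(i):
    # Hypothetical stand-in for the expensive, independent part of an iteration.
    return sum(k * k for k in range(i * 1000))


def sequential(n):
    """The loop as written: calculate, then accumulate, one iteration at a time."""
    total = 0
    for i in range(n):
        total += complex_calculation(i)
    return total


def unrolled(n, workers=4):
    """Issue every calculation up front, then accumulate in program order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Submitting returns immediately, so later iterations start
        # before earlier results have been consumed.
        pending = [pool.submit(complex_calculation, i) for i in range(n)]
        total = 0
        # The accumulate carries a dependency on the previous total,
        # so it stays serial and waits on each result in turn.
        for future in pending:
            total += future.result()
    return total


if __name__ == "__main__":
    print(unrolled(20) == sequential(20))
```

Threads are used here only to show the dependency structure.  In 
CPython, CPU-bound work like this would need separate processes to run 
in parallel, and that brings the serialization cost discussed above 
back into the picture.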


> If latency, then perhaps taskPR would work better in a multi-threaded
> R interpreter, rather than across a TCP/IP network fabric.

Yes, most especially if serialization and deserialization could be 
avoided.  However, I don't believe R is thread-safe?  (Using shared 
memory, but between multiple R processes, was on the TODO list when the 
project ended.)

I was fortunate to have access to a very large NUMA machine at the time 
that I was originally working on this project, so the network itself 
wasn't a limiting factor.  (The network stack turned out to be a 
problem, though.)


David Bauer


