[R] R with multiple processors

Duncan Temple Lang duncan at research.bell-labs.com
Wed Feb 21 00:02:55 CET 2001



> Cc: Isabel Canette <isabelc at cmat.edu.uy>, <r-help at stat.math.ethz.ch>

This is a general response to some of the issues raised in the
discussion about utilizing multiple processors within R.  Sorry for
responding to this at such a late stage. Time seems to be passing me
by quicker than usual these days :-) 

(This is more suited to r-devel, but since the topic was on r-help,
it will hopefully conclude there.)


There are plans for multi-threading R, perhaps at the end of this
summer.  Doing this smoothly so that it doesn't break user-level code
and packages requires some planning and thought. I have a plan for
this and need to find time to implement it with anyone who is
interested.  The design I have in mind is much the same as the one
John Chambers and I implemented in S version 4, but implemented
entirely from scratch.  (See
http://cm.bell-labs.com/stat/doc/multi-threaded-S.ps. I'll make a
condensed version available when we get closer to implementing the
facilities.)

I am working on several different packages that integrate R with
(potentially) threaded systems (e.g. Java, Perl, Python, Apache,
Netscape, XSLT, Postgres etc.) with the side effect of finding out
what support for threads is most urgently needed. Hopefully these will
stretch the model and show any deficiencies before we start.

Threads, connections, embedding R within databases, distributed
computing via CORBA/DCOM/SOAP/RMI, etc., all point to the fact that we
need to be working on parallel algorithms. It would be terrific to see
people exploring that.
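
To make "parallel algorithm" concrete, here is a minimal sketch at the
C level, assuming POSIX threads are available. It splits a vector into
contiguous chunks, sums each chunk in its own thread, and combines the
partial sums into a mean. The names parallel_mean, sum_chunk, chunk_t
and MAX_WORKERS are made up for illustration; none of this is part of
R or S, and it is not the design referred to above.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_WORKERS 16   /* illustrative upper bound, not an R limit */

/* One contiguous slice of the data plus a private slot for its result,
   so the workers never write to shared memory. */
typedef struct {
    const double *x;
    size_t begin, end;   /* half-open range [begin, end) */
    double sum;
} chunk_t;

/* Thread entry point: sum the slice described by arg. */
static void *sum_chunk(void *arg)
{
    chunk_t *c = arg;
    double s = 0.0;
    for (size_t i = c->begin; i < c->end; i++)
        s += c->x[i];
    c->sum = s;
    return NULL;
}

/* Mean of x[0..n-1] computed with nworkers threads. */
double parallel_mean(const double *x, size_t n, int nworkers)
{
    pthread_t tid[MAX_WORKERS];
    chunk_t chunk[MAX_WORKERS];
    int started[MAX_WORKERS];
    double total = 0.0;

    assert(x != NULL && n > 0);
    assert(nworkers >= 1 && nworkers <= MAX_WORKERS);

    for (int w = 0; w < nworkers; w++) {
        chunk[w].x = x;
        chunk[w].begin = n * (size_t)w / (size_t)nworkers;
        chunk[w].end = n * (size_t)(w + 1) / (size_t)nworkers;
        chunk[w].sum = 0.0;
        started[w] =
            pthread_create(&tid[w], NULL, sum_chunk, &chunk[w]) == 0;
        if (!started[w])
            sum_chunk(&chunk[w]);   /* could not start a thread: run serially */
    }

    /* Joining is the only synchronization point; partial sums are
       combined afterwards in a fixed order. */
    for (int w = 0; w < nworkers; w++) {
        if (started[w])
            pthread_join(tid[w], NULL);
        total += chunk[w].sum;
    }
    return total / (double)n;
}
```

Calling parallel_mean(x, n, 4) gives the same mean as a serial loop,
up to floating-point rounding from the different order of summation.
The same split/compute/combine shape carries over to distributed
settings, where the "threads" would be separate processes or machines.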

* While threads do provide some simplification to the event loop
handling, they do so by merely formalizing the synchronization, not
removing it.

* Clustered or distributed computing is probably most efficient at
run-time for many statistical tasks, as Thomas (I believe) pointed
out. There are some facilities to do this for R and S-Plus.

* It would be good for developers of new C code to write it so that it
is thread-safe. There are some guidelines on how to do this at
http://developer.r-project.org/RThreads/ (specifically in guide.html).

* From a practical perspective, there are a lot of pitfalls in loading
multi-threaded code into R. The order in which libraries are loaded,
how the code interacts with the event loop if it uses X11, and so on
all matter. One trick to simplify this is to embed R within the
multi-threaded application. This allows the multi-threaded
application to be (initially) in control and get its environment
appropriately organized before R enters the picture.
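
As a rough illustration of the thread-safety point above, here is a
small sketch of one common pattern: keep state in a structure the
caller owns instead of in static variables, and guard every update
with a mutex. It assumes POSIX threads. The accum_* names are
hypothetical, and the code is not taken from the guidelines at the URL
above.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* A thread-unsafe version would keep `total` and `count` as static
   variables, silently shared by every caller. Here the state lives in
   a structure owned by the caller, together with the lock that
   protects it. */
typedef struct {
    pthread_mutex_t lock;
    double total;
    long count;
} accum_t;

/* Returns 0 on success, as pthread_mutex_init does. */
int accum_init(accum_t *a)
{
    assert(a != NULL);
    a->total = 0.0;
    a->count = 0;
    return pthread_mutex_init(&a->lock, NULL);
}

/* Safe to call from several threads on the same accumulator. */
void accum_add(accum_t *a, double value)
{
    assert(a != NULL);
    pthread_mutex_lock(&a->lock);
    a->total += value;
    a->count++;
    pthread_mutex_unlock(&a->lock);
}

/* Reads both fields under the lock so they are mutually consistent. */
double accum_mean(accum_t *a)
{
    double m;
    assert(a != NULL);
    pthread_mutex_lock(&a->lock);
    m = a->count > 0 ? a->total / (double)a->count : 0.0;
    pthread_mutex_unlock(&a->lock);
    return m;
}

void accum_destroy(accum_t *a)
{
    assert(a != NULL);
    pthread_mutex_destroy(&a->lock);
}
```

Two accumulators never interfere with each other, and one accumulator
can be shared between threads. Code that only ever uses its own
accum_t from a single thread pays for a lock it does not need, which
is the usual trade-off of this approach.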



  Thanks for your time.
    D.

-- 
_______________________________________________________________

Duncan Temple Lang                duncan at research.bell-labs.com
Bell Labs, Lucent Technologies    office: (908)582-3217
700 Mountain Avenue, Room 2C-259  fax:    (908)582-3340
Murray Hill, NJ  07974-2070       
         http://cm.bell-labs.com/stat/duncan