[R] Appropriateness of R functions for multicore

R. Michael Weylandt michael.weylandt at gmail.com
Sat Aug 24 02:56:09 CEST 2013


On Mon, Aug 19, 2013 at 2:08 PM, Patrick Connolly
<p_connolly at slingshot.co.nz> wrote:
> On Sat, 17-Aug-2013 at 05:09PM -0700, Jeff Newmiller wrote:
>
>
> |> In most threaded multitasking environments it is not safe to
> |> perform IO in multiple threads. In general you will have difficulty
> |> performing IO in parallel processing so it is best to let the
> |> master hand out data to worker tasks and gather results from them
> |> for storage. Keep in mind that just because you have eight cores
> |> for processing doesn't mean you have eight hard disks, so if your
> |> problem is IO bound in single processor operation then it will also
> |> be IO bound in threaded operation.
>
> For tasks which don't involve I/O but fail with mclapply, how does one
> work out where the problem is?  The handy browser() function which
> allows for interactive diagnosis won't work with parallel jobs.
>
> What other approaches can one use?
>

browser() requires I/O, doesn't it? It has to read from and write to
the console.
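
One way around this, as a rough sketch rather than a recipe: have each
worker return its error as a value, find which inputs failed, and then
re-run just those inputs serially, where browser() and debug() work
again. The function f, its failing input, and the wrapper safe_f are
made up for illustration; only mclapply() and tryCatch() are real.

```r
library(parallel)

# Forking is not available on Windows, so fall back to one core there.
cores <- if (.Platform$OS.type == "windows") 1L else 2L

# Hypothetical worker that fails for one particular input.
f <- function(x) {
  if (x == 3) stop("bad input: ", x)
  sqrt(x)
}

# Return the condition object instead of letting the error propagate,
# so the parent process can see which task failed and why.
safe_f <- function(x) tryCatch(f(x), error = function(e) e)

res <- mclapply(1:5, safe_f, mc.cores = cores)

# Indices of the tasks that came back as errors.
failed <- which(vapply(res, inherits, logical(1), what = "error"))
print(failed)
print(conditionMessage(res[[failed[1]]]))

# In an interactive session the failing input can now be replayed in
# the master process, e.g.  debug(f); f(3)
```

Swapping mclapply() for plain lapply() while debugging gets you the
same effect with less ceremony, provided the failure doesn't depend on
the forking itself.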

Anyway -- my rule of thumb is that anything involving "the outside
world" won't work, so I/O covers most of it. You also need to be aware
of things like the RNG, which relies on a global random seed: that is
hidden shared state, a subtle kind of I/O, and it can bite you if you
haven't read the RNG concerns in the parallel package vignette. More
precisely, anything that would be hard in a functional language (like
Haskell) should make you wary.
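
To make the RNG point concrete, here is a minimal sketch of the
approach the parallel vignette describes, as I understand it: select
the "L'Ecuyer-CMRG" generator and set a seed before calling mclapply(),
so that each forked worker gets its own stream. The seed value, the
task count, and the names a and b are arbitrary choices for
illustration.

```r
library(parallel)

# Forking is not available on Windows, so fall back to one core there.
cores <- if (.Platform$OS.type == "windows") 1L else 2L

# A generator that supports multiple independent streams.
RNGkind("L'Ecuyer-CMRG")

# Same seed, same task list, same number of cores on both runs.
set.seed(42)
a <- mclapply(1:4, function(i) runif(1), mc.cores = cores)

set.seed(42)
b <- mclapply(1:4, function(i) runif(1), mc.cores = cores)

# Assumption: with the default mc.set.seed = TRUE and prescheduling,
# the two runs draw from the same streams and should therefore agree.
print(identical(a, b))
```

Note that this only promises reproducibility when the tasks are handed
to the workers in the same way each time, so changing mc.cores or
mc.preschedule can change the numbers.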

Cheers,
MW


