[Rd] SUGGESTION: Settings to disable forked processing in R, e.g. parallel::mclapply()
traversc sending from gmail.com
Fri Apr 12 21:31:24 CEST 2019
Just throwing my two cents in:
I think removing/deprecating fork would be a bad idea for two reasons:
1) There are no performant alternatives
2) Removing fork would break existing workflows
Even if replaced with something using the same interface (e.g., a
function that automatically detects variables to export as in the
amazing `future` package), the lack of copy-on-write functionality
would cause scripts everywhere to break.
A simple example illustrating these two points:
`x <- 5e8; mclapply(1:24, sum, x, 8)`
Using fork, `mclapply` takes 5 seconds. Using "psock", `clusterApply`
does not complete.
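A scaled-down sketch of the comparison above, in case it helps. Note that in the one-liner as written, `x` is a single number and the trailing `8` is passed on to `sum` rather than to `mc.cores`; the sketch assumes the intent was a large object shared across 8 workers. The sizes, the worker count, and the names `n_tasks`, `n_workers`, `fork_res` and `psock_res` are all hypothetical, chosen so it runs quickly:

```r
library(parallel)

# Assumed stand-in for the large object; the original post used 5e8.
x <- runif(1e6)
n_tasks <- 4      # hypothetical; the post used 24 tasks
n_workers <- 2    # hypothetical; the post appears to intend 8

# Fork-based: children see the parent's memory copy-on-write, so `x` is
# not serialized. Forking is unavailable on Windows, so use 1 core there.
cores <- if (.Platform$OS.type == "windows") 1L else n_workers
fork_res <- mclapply(seq_len(n_tasks), function(i) sum(x) + i,
                     mc.cores = cores)

# PSOCK-based: workers are fresh R processes, so `x` has to be serialized
# and sent to the workers along with each task.
cl <- makeCluster(n_workers, type = "PSOCK")
psock_res <- clusterApply(cl, seq_len(n_tasks),
                          function(i, x) sum(x) + i, x)
stopCluster(cl)

# Both approaches should agree on the results; only the cost differs.
isTRUE(all.equal(fork_res, psock_res))
```

At this size both variants finish quickly; the difference only shows up as `x` grows, since the PSOCK path pays for copying it to every worker.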
On Fri, Apr 12, 2019 at 2:32 AM Iñaki Ucar <iucar using fedoraproject.org> wrote:
> On Thu, 11 Apr 2019 at 22:07, Henrik Bengtsson
> <henrik.bengtsson using gmail.com> wrote:
> > ISSUE:
> > Using *forks* for parallel processing in R is not always safe.
> > [...]
> > Comments?
> Using fork() is never safe. The reference provided by Kevin is
> pretty compelling (I kindly encourage anyone who ever forked a process
> to read it). Therefore, I'd go beyond Henrik's suggestion, and I'd
> advocate for deprecating fork clusters and eventually removing them
> from parallel.
> https://www.microsoft.com/en-us/research/uploads/prod/2019/04/fork-hotos19.pdf
> Iñaki Úcar