[Rd] SUGGESTION: Settings to disable forked processing in R, e.g. parallel::mclapply()

Henrik Bengtsson
Thu Apr 11 22:06:47 CEST 2019


ISSUE:
Using *forks* for parallel processing in R is not always safe.  The
`parallel::mclapply()` function uses forked processes to parallelize.
One confirmed case where forked processing causes problems is running
R via RStudio.  There, it is recommended to use PSOCK clusters
(`parallel::makeCluster()`) rather than *forked* processes; see
https://github.com/rstudio/rstudio/issues/2597#issuecomment-482187011.

AFAIK, it is not straightforward to disable forked processing in R.

One could set the environment variable `MC_CORES=1`, which sets the R
option `mc.cores=1` when the parallel package is loaded.  Since
`mc.cores = getOption("mc.cores", 2L)` is the default for
`parallel::mclapply()`, this causes `mclapply()` to fall back to
`lapply()`, avoiding _forked_ processing.  However, this does not work
when the code specifies the `mc.cores` argument explicitly, e.g.
`mclapply(..., mc.cores = detectCores())`.
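
To illustrate the limitation, here is a minimal sketch that compares
process IDs.  It sets the option directly with `options(mc.cores = 1L)`,
which I am assuming is equivalent to what `MC_CORES=1` does at load
time.  The forking part only applies on Unix-alikes and, given the
issue above, is best run from a terminal rather than from RStudio.

```r
library(parallel)

parent_pid <- Sys.getpid()

# Assumption: this mimics the effect of MC_CORES=1 at package load time
options(mc.cores = 1L)

# Default argument mc.cores = getOption("mc.cores", 2L) is now 1, so
# mclapply() runs like lapply() in the current process
pids_default <- unlist(mclapply(1:4, function(i) Sys.getpid()))
no_fork_by_default <- all(pids_default == parent_pid)
print(no_fork_by_default)

# An explicit mc.cores overrides the option, so child processes are
# forked anyway (Unix-alikes only; Windows does not support forking)
if (.Platform$OS.type == "unix") {
  pids_explicit <- unlist(
    mclapply(1:4, function(i) Sys.getpid(), mc.cores = 2L)
  )
  forked_when_explicit <- any(pids_explicit != parent_pid)
  print(forked_when_explicit)
}
```

The second call is the case that `MC_CORES=1` cannot prevent.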


SUGGESTION:
Introduce an environment variable `R_ENABLE_FORKS` and a corresponding
R option `enable.forks`, both taking logical scalars.  By setting
`R_ENABLE_FORKS=false` or, equivalently, `enable.forks=FALSE`,
`parallel::mclapply()` would fall back to `lapply()`.

For `parallel::mcparallel()`, we could produce an error if forks are disabled.
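
A rough user-level sketch of the suggested behavior, only to make the
idea concrete.  Neither `enable.forks` nor `R_ENABLE_FORKS` exists in R
today; they are the settings proposed above.  The helper names
`forks_enabled()`, `safe_mclapply()` and `safe_mcparallel()` are
hypothetical, and letting the option take precedence over the
environment variable is my assumption.  A real implementation would
live inside the parallel package itself.

```r
library(parallel)

# Hypothetical: TRUE unless forks are disabled via the proposed
# option 'enable.forks' or environment variable 'R_ENABLE_FORKS'
forks_enabled <- function() {
  opt <- getOption("enable.forks", NULL)
  if (!is.null(opt)) return(isTRUE(as.logical(opt)))
  val <- as.logical(Sys.getenv("R_ENABLE_FORKS", "true"))
  if (is.na(val)) TRUE else val  # unparsable values keep the default
}

# Hypothetical wrapper: fall back to lapply() when forks are disabled
safe_mclapply <- function(X, FUN, ...) {
  if (forks_enabled()) return(mclapply(X, FUN, ...))
  args <- list(...)
  nms <- names(args)
  if (is.null(nms)) nms <- rep("", length(args))
  # Drop mc.* arguments, which lapply() would pass on to FUN
  args <- args[!startsWith(nms, "mc.")]
  do.call(lapply, c(list(X, FUN), args))
}

# Hypothetical wrapper: refuse to fork when forks are disabled
safe_mcparallel <- function(expr, ...) {
  if (!forks_enabled()) {
    stop("forked processing is disabled (enable.forks = FALSE)")
  }
  mcparallel(expr, ...)
}

# Usage: an explicit mc.cores no longer forces a fork
options(enable.forks = FALSE)
pids <- unlist(safe_mclapply(1:4, function(i) Sys.getpid(), mc.cores = 2L))
ran_in_parent <- all(pids == Sys.getpid())
print(ran_in_parent)
```

With forks disabled, the `mc.cores = 2L` argument is ignored and all
iterations run in the calling process.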


Comments?

/Henrik


