[Rd] SUGGESTION: Settings to disable forked processing in R, e.g. parallel::mclapply()

Henrik Bengtsson henr|k@bengt@@on @end|ng |rom gm@||@com
Fri Jan 10 07:33:51 CET 2020


I'd like to pick up this thread, which started on 2019-04-11
(https://hypatia.math.ethz.ch/pipermail/r-devel/2019-April/077632.html).
Modulo all the other suggestions in this thread, would my proposal of
being able to disable forked processing via an option or an
environment variable make sense?  I've prototyped a patch that works
like this:

> options(fork.allowed = FALSE)
> unlist(parallel::mclapply(1:2, FUN = function(x) Sys.getpid()))
[1] 14058 14058
> parallel::mcmapply(1:2, FUN = function(x) Sys.getpid())
[1] 14058 14058
> parallel::pvec(1:2, FUN = function(x) Sys.getpid() + x/10)
[1] 14058.1 14058.2
> f <- parallel::mcparallel(Sys.getpid())
Error in allowFork(assert = TRUE) :
  Forked processing is not allowed per option ‘fork.allowed’ or environment variable ‘R_FORK_ALLOWED’
> cl <- parallel::makeForkCluster(1L)
Error in allowFork(assert = TRUE) :
  Forked processing is not allowed per option ‘fork.allowed’ or environment variable ‘R_FORK_ALLOWED’
>


The patch is:

Index: src/library/parallel/R/unix/forkCluster.R
===================================================================
--- src/library/parallel/R/unix/forkCluster.R (revision 77648)
+++ src/library/parallel/R/unix/forkCluster.R (working copy)
@@ -30,6 +30,7 @@

 newForkNode <- function(..., options = defaultClusterOptions, rank)
 {
+    allowFork(assert = TRUE)
     options <- addClusterOptions(options, list(...))
     outfile <- getClusterOption("outfile", options)
     port <- getClusterOption("port", options)
Index: src/library/parallel/R/unix/mclapply.R
===================================================================
--- src/library/parallel/R/unix/mclapply.R (revision 77648)
+++ src/library/parallel/R/unix/mclapply.R (working copy)
@@ -28,7 +28,7 @@
         stop("'mc.cores' must be >= 1")
     .check_ncores(cores)

-    if (isChild() && !isTRUE(mc.allow.recursive))
+    if (!allowFork() || (isChild() && !isTRUE(mc.allow.recursive)))
         return(lapply(X = X, FUN = FUN, ...))

     ## Follow lapply
Index: src/library/parallel/R/unix/mcparallel.R
===================================================================
--- src/library/parallel/R/unix/mcparallel.R (revision 77648)
+++ src/library/parallel/R/unix/mcparallel.R (working copy)
@@ -20,6 +20,7 @@

 mcparallel <- function(expr, name, mc.set.seed = TRUE, silent = FALSE, mc.affinity = NULL, mc.interactive = FALSE, detached = FALSE)
 {
+    allowFork(assert = TRUE)
     f <- mcfork(detached)
     env <- parent.frame()
     if (isTRUE(mc.set.seed)) mc.advance.stream()
Index: src/library/parallel/R/unix/pvec.R
===================================================================
--- src/library/parallel/R/unix/pvec.R (revision 77648)
+++ src/library/parallel/R/unix/pvec.R (working copy)
@@ -25,7 +25,7 @@

     cores <- as.integer(mc.cores)
     if(cores < 1L) stop("'mc.cores' must be >= 1")
-    if(cores == 1L) return(FUN(v, ...))
+    if(cores == 1L || !allowFork()) return(FUN(v, ...))
     .check_ncores(cores)

     if(mc.set.seed) mc.reset.stream()

with a new file src/library/parallel/R/unix/allowFork.R:

allowFork <- function(assert = FALSE) {
    ## Environment variable R_FORK_ALLOWED sets the default (TRUE if unset)
    value <- Sys.getenv("R_FORK_ALLOWED")
    if (nzchar(value)) {
        value <- switch(value,
            "1"=, "TRUE"=, "true"=, "True"=, "yes"=, "Yes" = TRUE,
            "0"=, "FALSE"=, "false"=, "False"=, "no"=, "No" = FALSE,
            stop(gettextf("invalid environment variable value: %s==%s",
                          "R_FORK_ALLOWED", value)))
    } else {
        value <- TRUE
    }
    ## Option 'fork.allowed', if set, takes precedence over the environment variable
    value <- getOption("fork.allowed", value)
    if (is.na(value)) {
        stop(gettextf("invalid option value: %s==%s", "fork.allowed", value))
    }
    ## With assert = TRUE, signal an error instead of returning FALSE
    if (assert && !value) {
        stop(gettextf(
            "Forked processing is not allowed per option %s or environment variable %s",
            sQuote("fork.allowed"), sQuote("R_FORK_ALLOWED")))
    }
    value
}
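
As a rough illustration of the intended semantics that runs on an
unpatched R, the same decision can be approximated at user level.  This
is only a sketch: fork_allowed() and safe_mclapply() are hypothetical
names that exist neither in the patch nor in the parallel package, and
unlike allowFork() above the sketch silently treats unrecognized
environment variable values as "allowed" instead of signalling an error.

```r
## Hypothetical user-level stand-in for the proposed setting; it only
## mirrors the option/environment-variable precedence of allowFork().
fork_allowed <- function() {
    env <- Sys.getenv("R_FORK_ALLOWED", unset = "TRUE")
    ## Assumption of this sketch: anything not explicitly "false" means TRUE
    default <- !(env %in% c("0", "FALSE", "false", "False", "no", "No"))
    ## The option, if set, overrides the environment variable
    isTRUE(getOption("fork.allowed", default))
}

## Hypothetical wrapper: fall back to lapply() when forking is disabled
safe_mclapply <- function(X, FUN, ...) {
    if (fork_allowed()) parallel::mclapply(X, FUN, ...)
    else lapply(X, FUN, ...)
}

old <- options(fork.allowed = FALSE)
pids <- unlist(safe_mclapply(1:2, function(x) Sys.getpid()))
options(old)  # restore the previous setting
```

With the option set to FALSE, both elements of 'pids' are the PID of
the calling R process, which is the same behaviour the patched
mclapply() shows in the transcript at the top.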

/Henrik

On Mon, Apr 15, 2019 at 3:12 AM Tomas Kalibera <tomas.kalibera using gmail.com> wrote:
>
> On 4/15/19 11:02 AM, Iñaki Ucar wrote:
> > On Mon, 15 Apr 2019 at 08:44, Tomas Kalibera <tomas.kalibera using gmail.com> wrote:
> >> On 4/13/19 12:05 PM, Iñaki Ucar wrote:
> >>> On Sat, 13 Apr 2019 at 03:51, Kevin Ushey <kevinushey using gmail.com> wrote:
> >>>> I think it's worth saying that mclapply() works as documented
> >>> Mostly, yes. But it says nothing about fork's copy-on-write and memory
> >>> overcommitment, and that this means that it may work nicely or fail
> >>> spectacularly depending on whether, e.g., you operate on a long
> >>> vector.
> >> R cannot possibly replicate documentation of the underlying operating
> >> systems. It clearly says that fork() is used and readers who may not
> >> know what fork() is need to learn it from external sources.
> >> Copy-on-write is an elementary property of fork().
> > Just to be precise, copy-on-write is an optimization widely deployed
> > in most modern *nixes, particularly for the architectures in which R
> > usually runs. But it is not an elementary property; it is not even
> > possible without an MMU.
>
> Yes, old Unix systems without virtual memory had fork eagerly copying.
> Not relevant today, and certainly not for systems that run R, but indeed
> people interested in OS internals can look elsewhere for more precise
> information.
>
> Tomas
>


