[R] error with more than 100 forked processes

Henrik Bengtsson
Fri Apr 8 23:45:51 CEST 2022


The reason you hit the limit already at around 100 workers could be
that you already have other connections open, e.g. file connections,
capture.output(), etc.
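
To see how many connections a session already holds, base R's
showConnections() lists them. A minimal sketch; the temp file and its
connection are only there for illustration:

```r
# Count the connections currently open in this R session.
# all = TRUE also includes stdin, stdout, and stderr.
n_before <- nrow(showConnections(all = TRUE))

# Opening a file connection (illustrative temp file) uses up one more slot.
con <- file(tempfile(), open = "w")
n_after <- nrow(showConnections(all = TRUE))

close(con)
n_after - n_before
```

Every connection counted here reduces how many workers a
connection-based cluster can still be given.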

If you want to use *forked* processing with more than 125 workers
using bare-bones R, you can use parallel::mclapply() and friends,
because they don't use socket connections to communicate between the
main process and the workers.
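
As a rough sketch of that approach (Unix-alikes only, since forking is
not available on Windows): the worker count of 130 and the toy task
below are placeholders, and the OS must permit that many processes.

```r
library(parallel)

# Placeholder worker count, above the 125 mentioned above.
n_workers <- 130L

# One task per worker; each forked child reports its own process ID.
# As noted above, mclapply() does not use socket connections to talk
# to its workers.
pids <- mclapply(seq_len(n_workers), function(i) Sys.getpid(),
                 mc.cores = n_workers)

length(pids)
```

Whether this many forks is sensible depends on the machine; the sketch
only illustrates that the call is not limited by R's connection table.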

If you don't need *forked* processing per se, there are other
alternatives, as already pointed out above.

As the author of the future framework (https://www.futureverse.org/),
I obviously suggest you try that one. It's on CRAN and installs out of
the box on all OSes. You get several alternatives for parallel
backends. For *forked* processing, call plan(multicore) at the top of
your script, and it'll parallelize via the parallel::mclapply()
framework internally, so you won't have the connection limitation to
worry about(*). You can also use plan(future.callr::callr) to
parallelize via the callr package, which also doesn't have the
connection limitation. Your code will be the same regardless of which
you end up using. For the front end, there's
future.apply::future_lapply() et al. (parallel versions of the base
lapply functions), furrr::future_map() et al. (parallel versions of
purrr's map functions), and foreach w/ doFuture if you like the
y <- foreach(...) %dopar% { ... } style.
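
A minimal sketch of what that looks like with future.apply as the
front end, assuming the future and future.apply packages are
installed; the squaring function is just a stand-in for real work:

```r
library(future)
library(future.apply)

# Forked processing. Where forking is not supported (e.g. on Windows),
# future falls back to sequential processing.
plan(multicore)
# Alternative backend, if future.callr is installed; the code below
# stays unchanged:
# plan(future.callr::callr)

# Parallel counterpart of lapply(1:10, function(x) x^2)
y <- future_lapply(1:10, function(x) x^2)

# Reset to the default sequential backend
plan(sequential)

unlist(y)
```

Only the plan() line changes when switching backends; the
future_lapply() call stays as it is.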

(*) But there are other issues with forked processing, e.g. it might
not be compatible with multi-threaded code used by some packages. This
is a problem independent of futures per se.

Hope this helps

Henrik

On Fri, Apr 8, 2022 at 2:19 PM Ivan Krylov <krylov.r00t using gmail.com> wrote:
>
> On Fri, 8 Apr 2022 22:02:25 +0200
> Guido Kraemer via R-help <r-help using r-project.org> wrote:
>
> >  > cl <- makeForkCluster(128)
> > Error in UseMethod("sendData") :
> >    no applicable method for 'sendData' applied to an object of class
> > "NULL"
>
> In order to communicate with the workers, R creates connection objects.
> Unfortunately, the memory for connection objects in R has a
> statically-defined limit of 128. (A few connections are used by
> default, and a few more will likely be used by user code during the
> actual program run.)
>
> Try increasing the limit in #define NCONNECTIONS in
> src/main/connections.c and re-compiling R.
>
> See also: https://github.com/HenrikBengtsson/Wishlist-for-R/issues/28
> According to Henrik Bengtsson, R should work well even with as many
> as 16381 possible connections, but then you may run into OS limits on
> file descriptors.
>
>
> --
> Best regards,
> Ivan


