[Rd] MacOS parallel::makeCluster fails

Tomas Kalibera tomas.kalibera using gmail.com
Mon Aug 12 16:29:32 CEST 2019


For reference, in the end the problem turned out to be that "localhost"
could not be resolved. Thanks to Dominik for the report and for debugging
the problem. To be robust against such problems in the future, R now
falls back to the loopback device for localhost.
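
A quick way to check for this condition from within R is sketched below.
It assumes a Unix-alike (including macOS), where utils::nsl() looks up a
host name and returns NULL when the lookup fails. The helper name
resolve_host is made up for this illustration, and the result is only a
rough indicator of what the socket code will see.

```r
# Check whether a host name resolves to an IP address.
# Assumption: utils::nsl() is available (Unix-alikes only); it returns the
# IP address as a string, or NULL if the lookup fails.
resolve_host <- function(host = "localhost") {
  if (.Platform$OS.type != "unix") {
    stop("utils::nsl() is only available on Unix-alikes")
  }
  ip <- utils::nsl(host)
  list(host = host, resolves = !is.null(ip), ip = ip)
}

res <- resolve_host("localhost")
if (res$resolves) {
  cat("localhost resolves to", res$ip, "\n")
} else {
  cat("localhost does not resolve; check /etc/hosts\n")
}
```

If the lookup fails, makeCluster() on an R version without the fallback
may hang in the way described in this thread.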

Best
Tomas

On 7/12/19 11:22 AM, Dominik Leutnant wrote:
> Hi Tomas,
>
> thanks for your reply (and thanks for your patience...).
> I am now using the following minimal reprex:
>
>> library(parallel)
>> cl <- makeCluster(2L)
> I freshly started the machine and did not open any other app. Just R.app (3.6.1).
>
> After executing the second line of code, R seems to hang indefinitely and does not respond.
> The R process itself uses almost no CPU.
>
> Unfortunately, I have no experience with either "Sock_listen" or "dtruss".
> Is there an example somewhere available?
>
> Best
> Dominik
>
>
>
>
> On 05.06.19, 10:18, "Tomas Kalibera" <tomas.kalibera using gmail.com> wrote:
>
>      Hi Dominik,
>      
>      from the output, the master process could not "listen" on the port where
>      it expects a connection from the worker. We need to find out why.
>      
>      I'd recommend first creating a minimal reproducible example (one that
>      does not use future, only parallel, and a minimal number of workers,
>      ideally just 2). Then I'd recommend checking whether the problem still
>      exists with R-devel. Then I'd check whether the problem happens in all
>      invocations, even after reboots, on a clean system, without many running
>      applications - if it does, this is good news. Then you could post such an
>      example and we could help more: if we can reproduce it on our systems, we
>      can debug it; if not, we could at least give more directed advice on how
>      to debug on your side.
>      
>      What I'd do myself if I could reproduce it on my system would be to
>      instrument R around Sock_listen in the internet module to see exactly
>      what has failed and with which error. Maybe dtruss would help too, but
>      instrumenting may be easier.
>      
>      The earlier problem you mention has never been diagnosed (it was only
>      intermittent on the reporter's machine, we could not reproduce it on our
>      systems, and despite a lot of effort on our side and on the reporter's,
>      we could not reliably diagnose it). In principle, it could be some race
>      condition in R (one has been fixed since the previous report), but
>      especially if it is deterministic, it would more likely be some OS limit
>      on your system. You could of course try playing with OS limits (on the
>      number of open files, etc.), changing the port number (port= option),
>      etc., but I would recommend the systematic approach of debugging the
>      cause.
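
As a lighter first step before instrumenting R, one could probe whether
the master can listen on a given port at all. The sketch below is only
an illustration: can_listen is a made-up helper, and it assumes R >= 4.0.0,
where (to my knowledge) serverSocket() was added. On older versions the
call fails and the helper simply reports FALSE.

```r
# Try to open a listening socket on `port` and close it again.
# Returns TRUE on success, FALSE if the socket could not be created.
# Assumption: serverSocket() exists (R >= 4.0.0).
can_listen <- function(port) {
  srv <- tryCatch(serverSocket(port), error = function(e) NULL)
  if (is.null(srv)) return(FALSE)
  close(srv)
  TRUE
}

# 11867 is the port from the report; parallel otherwise picks a port in
# 11000:11999 unless R_PARALLEL_PORT or the port= option is set.
ports <- c(11867L, 11000L, 11999L)
print(vapply(ports, can_listen, logical(1)))
```

A FALSE here only says that listening failed, not why; finding the
actual error still needs the instrumentation described above.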
>      
>      Best
>      Tomas
>      
>      On 6/4/19 10:45 AM, Dominik Leutnant wrote:
>      > Hi all,
>      >
>      > The call parallel::makeCluster(1L) hangs indefinitely on my macOS machine, which seems to have been reported already by some people (e.g., https://stat.ethz.ch/pipermail/r-devel/2018-February/075565.html).
>      > However, the solutions posted on SO, GH or R-devel do not work in my case.
>      >
>      > So far, I have tried the following without success:
>      >
>      >    1.  A couple of reboots
>      >    2.  Adding 192.0.0.1 to /etc/hosts
>      >    3.  Using R.app instead of RStudio.app
>      >    4.  Turning off the firewall
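
Since the /etc/hosts file is one of the things tried above, here is a
minimal sketch for checking from R whether that file maps the name
"localhost" to a loopback address. The helper hosts_maps_localhost is
hypothetical, and it assumes the usual convention that 127.0.0.1 and ::1
are the loopback addresses. It does not consult DNS or any other
resolver source.

```r
# Return TRUE if any non-comment line maps the name "localhost" to a
# loopback address (assumed here to be 127.0.0.1 or ::1).
hosts_maps_localhost <- function(lines) {
  lines <- sub("#.*$", "", lines)   # drop comments
  lines <- trimws(lines)
  lines <- lines[nzchar(lines)]     # drop empty lines
  for (ln in lines) {
    fields <- strsplit(ln, "[[:space:]]+")[[1]]
    addr <- fields[1]
    aliases <- fields[-1]
    if (addr %in% c("127.0.0.1", "::1") && "localhost" %in% aliases) {
      return(TRUE)
    }
  }
  FALSE
}

hosts_file <- "/etc/hosts"
if (file.exists(hosts_file)) {
  print(hosts_maps_localhost(readLines(hosts_file, warn = FALSE)))
}
```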
>      >
>      > Following Hendrik's advice, “cl <- future::makeClusterPSOCK(1L, verbose = TRUE, timeout = 60)” gives (note: without adding the timeout parameter, R just hangs):
>      >> Sys.setenv(LANGUAGE='en')
>      >> cl <- future::makeClusterPSOCK(1L, verbose = TRUE, timeout = 60)
>      > [local output] Workers: [n = 1] ‘localhost’
>      > [local output] Base port: 11867
>      > [local output] Creating node 1 of 1 ...
>      > [local output] - setting up node
>      > Testing if worker's PID can be inferred: ‘'/Library/Frameworks/R.framework/Resources/bin/Rscript' -e 'try(cat(Sys.getpid(),file="/var/folders/5s/kgm05t2s0_52gz1s445mnlgw0000gn/T//RtmpZp1RX6/future.parent=835.3434fe0c5c6.pid"), silent = TRUE)' -e "file.exists('/var/folders/5s/kgm05t2s0_52gz1s445mnlgw0000gn/T//RtmpZp1RX6/future.parent=835.3434fe0c5c6.pid')"’
>      > - Possible to infer worker's PID: TRUE
>      > [local output] Starting worker #1 on ‘localhost’: '/Library/Frameworks/R.framework/Resources/bin/Rscript' --default-packages=datasets,utils,grDevices,graphics,stats,methods -e 'try(cat(Sys.getpid(),file="/var/folders/5s/kgm05t2s0_52gz1s445mnlgw0000gn/T//RtmpZp1RX6/future.parent=835.3434fe0c5c6.pid"), silent = TRUE)' -e 'parallel:::.slaveRSOCK()' MASTER=localhost PORT=11867 OUT=/dev/null TIMEOUT=60 XDR=TRUE
>      > [local output] - Exit code of system() call: 0
>      > [local output] Waiting for worker #1 on ‘localhost’ to connect back
>      > [local output] Detected a warning from socketConnection(): ‘problem in listening on this socket’
>      > Killing worker process (PID 903) if still alive
>      > Worker (PID 903) was successfully killed: TRUE
>      > Error in socketConnection("localhost", port = port, server = TRUE, blocking = TRUE,  :
>      >    Failed to launch and connect to R worker on local machine ‘localhost’ from local machine ‘Dominiks-MBP.local’.
>      > * The error produced by socketConnection() was: ‘cannot open the connection’
>      > * In addition, socketConnection() produced 1 warning(s):
>      >     - Warning #1: ‘problem in listening on this socket’
>      > * The localhost socket connection that failed to connect to the R worker used port 11867 using a communication timeout of 60 seconds and a connection timeout of 120 seconds.
>      > * Worker launch call: '/Library/Frameworks/R.framework/Resources/bin/Rscript' --default-packages=datasets,utils,grDevices,graphics,stats,methods -e 'try(cat(Sys.getpid(),file="/var/folders/5s/kgm05t2s0_52gz1s445mnlgw0000gn/T//RtmpZp1RX6/future.parent=835.3434fe0c5c6.pid"), silent = TRUE)' -e 'parallel:::.slaveRSOCK()' MASTER=localhost PORT=11867 OUT=/dev/null TIMEOUT=60 XDR=TRUE.
>      > * Worker (PID 903) was successfully killed: TRUE
>      > * Troubleshooting suggestions:
>      >     - Suggestion #1: Set 'outfile=NULL' to see output from worker.
>      > In addition: Warning message:
>      > In socketConnection("localhost", port = port, server = TRUE, blocking = TRUE,  :
>      >    problem in listening on this socket
>      >
>      > My session looks like:
>      >> sessionInfo()
>      > R version 3.6.0 (2019-04-26)
>      > Platform: x86_64-apple-darwin15.6.0 (64-bit)
>      > Running under: macOS Mojave 10.14.5
>      >
>      > Matrix products: default
>      > BLAS:   /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRblas.0.dylib
>      > LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib
>      >
>      > Random number generation:
>      > RNG:     Mersenne-Twister
>      >   Normal:  Inversion
>      >   Sample:  Rounding
>      >
>      > locale:
>      > [1] de_DE.UTF-8/de_DE.UTF-8/de_DE.UTF-8/C/de_DE.UTF-8/de_DE.UTF-8
>      >
>      > attached base packages:
>      > [1] stats     graphics  grDevices utils     datasets  methods   base
>      >
>      > loaded via a namespace (and not attached):
>      > [1] compiler_3.6.0
>      > Any help is greatly appreciated.
>      > Best regards
>      > Dominik
>      >
>      > Dr. Dominik Leutnant
>      >
>      > Muenster University of Applied Sciences
>      > Department of Civil Engineering
>      > Institute for Infrastructure·Water·Resources·Environment (IWARU)
>      > WG Urban Hydrology and Water Management
>      > Corrensstr. 25
>      > FRG-48149 Münster, Germany
>      >
>      > Tel.:  +49 (0) 251/83-65274
>      > Fax:  +49 (0) 251/83-65915
>      > Mail:  leutnant using fh-muenster.de<mailto:leutnant using fh-muenster.de>
>      > Web: https://www.fh-muenster.de/
>      >
>      >
>      > ______________________________________________
>      > R-devel using r-project.org mailing list
>      > https://stat.ethz.ch/mailman/listinfo/r-devel
>      
>      
>      
>


