[R] Help needed to understand an error message produced from furrr and future packages

Jeff Newmiller
Thu Nov 25 17:34:15 CET 2021


This question is off-topic here: you are asking about contributed packages (see the Posting Guide). It is like stopping people on the street to ask this question: someone might know the answer, but most will be puzzled.

You should know that multicore, which forks the R process rather than starting threads, is quite sensitive to which kinds of operations you run in the workers. You may be lucky that it returned an error, because it can just as easily return an invalid result with no error. Getting reliable results with it is a bit of an art and may require detailed knowledge of your OS. I don't claim such knowledge... good luck.
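
For what it is worth, a minimal sketch of the variant you reported as working, with multisession in place of multicore for the inner layer, might look like the following. It is not your setup: both layers run on the local machine, the worker counts are made up, and the small PID-returning future is only a stand-in for VCtransfrm(). The future.globals.onReference option is a setting of the future package that stops as soon as a global carries a non-exportable reference, such as the 'externalptr' named in your error message.

```r
library(future)

# Fail early if a global contains a non-exportable reference
# (e.g. an externalptr) instead of failing later inside a worker.
options(future.globals.onReference = "error")

# Two-level plan. Both layers use separate background R sessions
# (multisession) rather than forked processes (multicore).
# The worker counts are illustrative assumptions.
plan(list(
  tweak(multisession, workers = 2),
  tweak(multisession, workers = 2)
))

# Stand-in for the real work: the outer future runs in a first-layer
# worker and launches an inner future on the second layer.
f <- future({
  g <- future(Sys.getpid())
  value(g)
})
worker_pid <- value(f)

# Shut the workers down and return to sequential processing.
plan(sequential)
```

If the option raises an error on your real code, the message should point at the global that holds the reference, which is probably a better place to start than the OS.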

On November 25, 2021 4:19:09 AM PST, Hiroto Miyoshi <hiroto-miyoshi using e-mail.jp> wrote:
>Dear R-users
>
>I need help understanding an error message from a furrr function.
>I am trying to build a parallel computing system that combines two
>desktop computers. One is the host, which runs Ubuntu under WSL2,
>and the other is a slave, which runs Ubuntu as its OS.
>They are connected to each other over a LAN.
>
>The host computer has 8 physical cores (16 logical cores), and the
>slave has 4 physical cores (8 logical cores).
>
>I wrote the following code chunk:
>
> > nodes<-c(rep("localhost",7),rep("192.168.1.11",4))
> > plan(list(tweak(cluster, workers = nodes),tweak(multicore,workers=2)))
> > system.time(VCtransfrm("typeIII"))
>
>Here VCtransfrm() is the target function, in which future_pmap and
>future_map are called topologically. The argument "typeIII" names the
>file that is passed to the VCtransfrm function. The typeIII file is the
>largest and has 165 MB of data, while a typeII file is smaller and has
>only 7 MB of data.
>
>The chunk runs just fine when the typeII data is fed. However, when the
>typeIII data was fed, it gave the following error messages and returned
>to the R prompt. Oddly, multiple R sessions were still running on the
>host computer when I observed its behaviour with Ubuntu's top command.
>The error messages are:
>
>Error in unserialize(node$con) :
>   ClusterFuture (<none>) failed to receive results from cluster 
>RichSOCKnode #10 (PID 47955 on localhost ‘localhost’). The reason 
>reported was ‘error reading from connection’. Post-mortem diagnostic: No 
>process exists with this PID, i.e. the localhost worker is no longer 
>alive. Detected a non-exportable reference (‘externalptr’) in one of the 
>globals (‘...furrr_fn’ of class ‘function’) used in the future 
>expression. The total size of the 8 globals exported is 3.77 MiB. The 
>three largest globals are ‘...furrr_chunk_args’ (3.30 MiB of class 
>‘list’), ‘...furrr_fn’ (456.55 KiB of class ‘function’) and 
>‘...furrr_map_fn’ (11.91 KiB of class ‘function’)
>Timing stopped at: 2.285 4.291 37.53
>
>I hasten to add that if "multicore" in the chunk is changed to
>"multisession", the chunk runs without a problem even when the typeIII
>file is fed.
>
>I need to understand what these messages mean and how to fix this
>problem. Since the chunk runs just fine for the smaller data, I reasoned
>that the problem could not be a logic error in the code.
>
>Please direct me to the solution of the problem.
>Any suggestion will be greatly appreciated.
>
>Sincerely,
>
>Hiroto
>

-- 
Sent from my phone. Please excuse my brevity.


