[Rd] Random behavior of mclapply

Thibault Vatter
Mon Oct 22 17:12:20 CEST 2018


Hi Tomas,

Thanks a lot for the explanation and the changes. The update in the
documentation is especially helpful.

Best,
Thibault




On Thu, Oct 18, 2018 at 10:48 AM Tomas Kalibera <tomas.kalibera using gmail.com>
wrote:

>
> Hi Thibault,
>
> mclapply has been designed to signal an error in two ways. Errors in user
> code are returned as special objects (of class "try-error") in the
> respective elements of the result list. All other errors (including a
> killed process) are returned as NULL in the respective elements of the
> result list. To detect these errors reliably, one needs to implement FUN
> so that it never returns NULL normally (it also cannot return a raw
> vector). This is how mclapply was designed and implemented (and also
> mccollect, etc.). It may be surprising to see multiple NULL elements when
> a single process is killed, but this is expected with pre-scheduling
> when that process has been tasked to compute multiple elements.
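
A minimal sketch of the detection pattern described above, assuming a Unix-alike (forking is not available on Windows). The names safe_square and classify_results are hypothetical helpers written for this illustration and are not part of the parallel package. The worker wraps its value in a list so that it never returns NULL normally, and mc.preschedule = FALSE is used here so that each element runs as its own job.

```r
library(parallel)

# Hypothetical worker: wraps its value in a list so a normal return is never NULL
safe_square <- function(i) {
  if (i == 2) stop("user-code failure for element 2")
  list(value = i^2)
}

# Hypothetical helper: label each element of an mclapply result list
classify_results <- function(res) {
  vapply(res, function(x) {
    if (is.null(x)) {
      "no result"            # e.g. the worker process was killed
    } else if (inherits(x, "try-error")) {
      "user-code error"      # FUN called stop() or otherwise failed
    } else {
      "ok"
    }
  }, character(1))
}

# One job per element, so a failure affects only its own element
res <- suppressWarnings(
  mclapply(1:4, safe_square, mc.cores = 2, mc.preschedule = FALSE)
)
status <- classify_results(res)
print(status)
```

Elements labelled "no result" can then be recomputed or reported, which is only unambiguous because the worker itself never returns NULL.
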
>
> To make this API more user friendly, I've added a warning that is now
> emitted when a job does not deliver a result (that is, when a vector
> element is NULL because of such an error). I've also made it more explicit
> in the documentation that NULL signals an error.
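
A hedged sketch of how such a warning could be collected programmatically, assuming a Unix-alike and an R version that includes the change described above. The exact warning text is not quoted in this thread, so the code only records whatever warnings are raised. The worker fragile is a hypothetical stand-in for a process hit by the out-of-memory killer: it kills its own process for one input.

```r
library(parallel)

# Hypothetical worker: kills its own (forked) process for one input
fragile <- function(i) {
  if (i == 2) tools::pskill(Sys.getpid(), tools::SIGKILL)
  i
}

warnings_seen <- character()

res <- withCallingHandlers(
  # Default pre-scheduling with 2 cores: elements 2 and 4 share a worker,
  # so both are expected to come back NULL when that worker dies
  mclapply(1:4, fragile, mc.cores = 2),
  warning = function(w) {
    # Record the message and continue instead of printing it
    warnings_seen <<- c(warnings_seen, conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)

lost <- which(vapply(res, is.null, logical(1)))
print(lost)
print(warnings_seen)
```

On an R version without the new warning, warnings_seen stays empty and the NULL elements remain the only signal.
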
>
> Best,
> Tomas
>
>
> On 07/26/2018 08:37 PM, Thibault Vatter wrote:
> > Hi,
> >
> > I wondered about the behavior described in the following Stack Overflow
> > question:
> >
> > https://stackoverflow.com/questions/20674538/mclapply-returns-null-randomly
> >
> > More specifically, I would like to know if you ever considered the
> > suggestion made in the comments of the first answer, namely to somehow
> > warn the user if one of the processes has been killed by the
> > out-of-memory killer?
> >
> > I am always surprised to see the random NULLs without
> > message/warning/error of any kind, and I think that it could be a useful
> > feature to know whether the function executed by mclapply returned a
> > NULL or if the process was killed for some reason.
> >
> > In the following gist, I have an example of this (in this case
> > non-random) behavior:
> >
> > https://gist.github.com/tvatter/2fcf3a9a99c256f9b9360f596b300715
> >
> > For the record, I generate the list of NULLs in the 4th mclapply in the
> > gist above on a late-2013 MacBook Pro with macOS High Sierra and 16GB of
> > memory, and my sessionInfo() is:
> >
> > R version 3.5.0 (2018-04-23)
> > Platform: x86_64-apple-darwin16.7.0 (64-bit)
> > Running under: macOS High Sierra 10.13.6
> >
> > Matrix products: default
> > BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
> > LAPACK: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libLAPACK.dylib
> >
> > locale:
> > [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
> >
> > attached base packages:
> > [1] parallel  stats     graphics  grDevices utils     datasets  methods   base
> >
> > loaded via a namespace (and not attached):
> > [1] compiler_3.5.0 tools_3.5.0    yaml_2.1.19
> >
> > ------------------------------------------------------------
> > Thibault Vatter
> > Department of Statistics
> > Columbia University
> >
> > ______________________________________________
> > R-devel using r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
>
>
>
