[Rd] Is there a way to disable / warn about forking?

Thomas Friedrichsmeier thomas.friedrichsmeier at ruhr-uni-bochum.de
Tue Oct 4 20:05:16 CEST 2011


Hi,

On Tuesday 04 October 2011, Simon Urbanek wrote:
> I don't see why this should be anything new - this is already happening
> since both packages that were folded into parallel (snow and multicore)
> are well known and well used.
> 
> In multicore we were explicitly warning about this and also working around
> issues where possible (e.g. the Mac GUI). Judging by the
> widespread use of multicore and the absence of problem reports related to
> GUIs, my impression would be that this aspect is not really a problem
> (more below). We get more users confused about the inability to perform
> side-effects than this, for example.

Well, some users do heed the advice to address their problem reports to the
package / GUI maintainers, especially if they find that the problem only
occurs with the GUI loaded, not in a "plain" R session.

The RKWard bug tracker has had a problem report about using mclapply() for a
while already.
 
> In general, there are two main issues that can be addressed by the GUI:
> 
> a) shared file descriptors. This is a problem if the GUI uses FDs for
> communication and they are not closed in the child instance. You don't
> want both the child and the parent to process those FDs. E.g., closeAll()
> can be used to work around that issue and with parallel there could be an
> easier interface for this given that it's in core R.
> 
> b) event loop. If the GUI hooks into the event loop then, obviously, this
> is only intended to be run from the master. multicore was already
> disabling the event loop hook for AQUA, but it was hard to provide a more
> comprehensive solution since it needed cooperation of R. In parallel it's
> much easier, because it can modify R to allow the event loop conditionally
> and thus only in the master process.

For me, the problem set was: multiple threads + mutexes, linking to a
library that installs a SIGCHLD handler, and code waiting for the
"communicator" thread to negotiate something with the frontend, except that
thread doesn't exist in the fork()ed child process...
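
To illustrate the last point: after fork(), only the thread that called
fork() exists in the child. Below is a minimal sketch in Python (chosen only
for brevity, this is not RKWard's actual code). The helper thread is merely a
stand-in for the "communicator" thread, and count_threads_in_child is a
hypothetical name. It assumes a Unix-like system where os.fork() is available.

```python
import os
import threading


def count_threads_in_child():
    """Fork while a helper thread runs; return (parent_count, child_count)."""
    stop = threading.Event()
    # Stand-in for a "communicator" thread that would talk to a frontend.
    worker = threading.Thread(target=stop.wait, daemon=True)
    worker.start()

    read_fd, write_fd = os.pipe()
    parent_count = threading.active_count()
    pid = os.fork()
    if pid == 0:
        # Child: the helper thread was not duplicated by fork().
        os.close(read_fd)
        os.write(write_fd, str(threading.active_count()).encode())
        os._exit(0)

    # Parent: read the child's report, then clean up.
    os.close(write_fd)
    child_count = int(os.read(read_fd, 16).decode())
    os.close(read_fd)
    os.waitpid(pid, 0)
    stop.set()
    worker.join()
    return parent_count, child_count
```

Any code in the child that waits for the helper thread to answer would block
forever, which is the kind of hang described above.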

After spending the day debugging, I think I have finally solved the key issues
for RKWard. That also means the issue is mostly painless for me now. However,
addressing fork()-related issues is not always a trivial exercise, and I
continue to think that it could be useful for maintainers of "problematic"
packages to have a way to stop or warn direct and indirect users who call
mcfork().

> The whole point of parallel is that it can do more than an external
> package, so I think you're going about it the wrong way - you should be
> talking to us much earlier so whatever your constraints in RKWard can be
> possibly addressed by the infrastructure. Also note that a lot of this
> should be seamless, a lot of users don't care what the infrastructure is,
> they just want their task to run in parallel, they don't care about
> mcfork() and the like - the choices will be made for them, because there
> is no fork on Windows, for example.

Exactly. I want the choice to be made for the user, where reasonably possible.
My point is that knowing whether you're on Windows or a Unix is not enough to
decide which technique to use in this case. Reliably enumerating all corner
cases where forking could be a problem on Unix is probably next to impossible.
The developers responsible for those corner cases have a decent chance of
being aware of the problem, though. And thus, I think it would be a good idea
if they had a standard way of informing library(parallel), and any third party
using library(parallel), that there is a problem with forking.
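
As a rough sketch of what such a mechanism might look like (written in Python
purely for illustration, nothing like this exists in parallel as far as this
message states): a GUI or package registers an objection to forking, and the
parallel layer consults the registry before choosing a backend. All names
here (register_fork_objection, choose_backend, hard_veto) are hypothetical.

```python
import os
import warnings

# Hypothetical registry: component name -> (reason, hard_veto).
_fork_objections = {}


def register_fork_objection(component, reason, hard_veto=False):
    """Called by a GUI or package that knows forking is unsafe for it."""
    _fork_objections[component] = (reason, hard_veto)


def choose_backend():
    """Return "fork" or "separate-process" for a parallel job."""
    if not hasattr(os, "fork"):
        return "separate-process"  # e.g. Windows, where fork is unavailable
    vetoed = False
    for component, (reason, hard_veto) in _fork_objections.items():
        if hard_veto:
            vetoed = True
        else:
            # Soft objection: still fork, but tell the user about the risk.
            warnings.warn(f"forking may misbehave with {component}: {reason}")
    return "separate-process" if vetoed else "fork"
```

A frontend could then call, for example,
register_fork_objection("example-gui", "communicator thread is lost after fork()", hard_veto=True)
once at startup, and neither the user nor third-party packages would need to
know about it.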

Regards
Thomas