[Rd] Process to Incorporate Functions from {parallelly} into base R's {parallel} package

Martin Maechler m@ech|er @end|ng |rom @t@t@m@th@ethz@ch
Wed Nov 11 11:02:47 CET 2020


>>>>> Duncan Murdoch 
>>>>>     on Sat, 7 Nov 2020 15:44:32 -0500 writes:

    > If these are easy changes, maybe someone will incorporate
    > them.  You'll make the argument stronger for doing that if
    > you can explain why it's better to do that than to keep
    > them in parallelly.

    > Duncan Murdoch

Thank you, Duncan, Henrik, and James Joseph.

From reading the thread, I agree that this is something worth updating in
R's own `parallel` (and I have tried it and checked that it does not
break our own 'make check-all').

Henrik (or anyone): Is there a small reproducible example I could add to
parallel/tests/*.R that would show the advantage of allowing an
empty 'user' here?

Martin Maechler


    > On 07/11/2020 1:39 p.m., Henrik Bengtsson wrote:
    >> FWIW, there are indeed a few low-hanging bug fixes in
    >> 'parallelly' that should be easy to incorporate into
    >> 'parallel' without adding extra maintenance.  For
    >> example, in parallel::makePSOCKcluster(), it is not
    >> possible to disable SSH option '-l USER' so that it can
    >> be set in ~/.ssh/config.  The remote user name will be
    >> the user name of your local machine and if you try to set
    >> user=NULL, you'll end up with an invalid SSH call.  The
    >> current behavior means that you are forced to specify the
    >> remote user name in your R code.  All it takes to fix
    >> this is to update:
    >> 
    >> cmd <- paste(rshcmd, "-l", user, machine, cmd)
    >> 
    >> to something like:
    >> 
    >> cmd <- paste(rshcmd, if (length(user) == 1L) paste("-l", user), machine, cmd)
    >> 
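A minimal sketch of the idea above, written as a standalone function so it can be run without a remote host. The helper name build_ssh_cmd() and the host name are hypothetical and are not part of 'parallel' or 'parallelly'; the sketch collects the pieces with c(), which drops NULL, before pasting them together.

```r
# Hypothetical helper (an assumption for illustration, not code from 'parallel'):
# build the SSH launch command, adding '-l USER' only when exactly one
# user name is supplied.
build_ssh_cmd <- function(rshcmd, user, machine, cmd) {
  # 'if' without 'else' yields NULL when the condition is FALSE
  user_opt <- if (length(user) == 1L) paste("-l", user)
  # c() drops NULL, so no empty field is left behind in the command
  paste(c(rshcmd, user_opt, machine, cmd), collapse = " ")
}

# With user = NULL the user name is left to ~/.ssh/config
print(build_ssh_cmd("ssh", NULL, "remote.example.org", "Rscript -e 1"))
# With an explicit user the '-l' option is included
print(build_ssh_cmd("ssh", "alice", "remote.example.org", "Rscript -e 1"))
```

A check along these lines only inspects the command string, so it would not need SSH access to run.
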
    >> This is one example of what I've patched in
    >> parallelly::makeClusterPSOCK() over the years.  Another
    >> is the use of reverse tunneling in SSH - that completely
    >> avoids the need to know and specify your public IP and
    >> reconfiguring the firewalls from the remote server back
    >> to your local machine so that the worker can connect back
    >> to your local machine.  Not many users have the
    >> permission to reconfigure firewalls and it's also
    >> extremely tedious.  Reverse SSH tunneling is super
    >> simple; all you need to do is something like:
    >> 
    >> rshopts <- c(sprintf("-R %d:%s:%d", rscript_port, master, port), rshopts)
    >> 
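As a rough illustration of the line above, the sketch below wraps it in a function. The helper name add_reverse_tunnel() and the port numbers are hypothetical; the variable names follow the quoted snippet. With OpenSSH, '-R port:host:hostport' asks the remote side to listen on 'port' and forward connections back to 'host:hostport' as seen from the local machine.

```r
# Hypothetical helper (an assumption for illustration): prepend a
# reverse-tunnel option to an existing vector of SSH options.
add_reverse_tunnel <- function(rshopts, rscript_port, master, port) {
  # '-R <remote port>:<host>:<host port>' in OpenSSH syntax
  tunnel <- sprintf("-R %d:%s:%d", rscript_port, master, port)
  c(tunnel, rshopts)
}

# Example values are made up; the ports must be integer-valued for '%d'
opts <- add_reverse_tunnel("-q", 11001L, "localhost", 11000L)
print(opts)
```

Under this setup the worker would connect to the forwarded port on its own side of the tunnel, so no public IP address or firewall change is needed on the local machine.
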
    >> /Henrik
    >> 
    >> On Fri, Nov 6, 2020 at 4:37 PM Duncan Murdoch
    >> <murdoch.duncan at gmail.com> wrote:
    >>> 
    >>> On 06/11/2020 4:47 p.m., Balamuta, James Joseph wrote:
    >>>> Hi all,
    >>>> 
    >>>> Henrik Bengtsson has done some fantastic work with
    >>>> {future} and, more importantly, greatly improved
    >>>> constructing and deconstructing a parallelized
    >>>> environment within R. It was with great joy that I saw
    >>>> Henrik slowly split off some functionality of {future}
    >>>> into the {parallelly} package. Reading over the package’s
    >>>> README, he states:
    >>>> 
    >>>>> The functions and features added to this package are
    >>>>> written to be backward compatible with the parallel
    >>>>> package, such that they may be incorporated there
    >>>>> later.  The parallelly package comes with an open
    >>>>> invitation for the R Core Team to adopt all or parts
    >>>>> of its code into the parallel package.
    >>>> 
    >>>> https://github.com/HenrikBengtsson/parallelly
    >>>> 
    >>>> I’m wondering what the appropriate process would be to
    >>>> slowly merge some functions from {parallelly} into the
    >>>> base R {parallel} package. Should this be done with
    >>>> targeted issues on Bugzilla for different fields Henrik
    >>>> has identified? Or would an omnibus patch bringing in
    >>>> all suggested modifications be preferred? Or is it best
    >>>> to discuss via the list-serv appropriate contributions?
    >>> 
    >>> One way is to convince R Core that incorporating this
    >>> into the parallel package would
    >>> 
    >>> - make less work for them, or
    >>> - add a lot to R that couldn't happen if it was a
    >>>   contributed package.
    >>> 
    >>> The fact that it's good isn't a good reason to put it
    >>> into a base package, which would largely mean
    >>> transferring Henrik's workload to R Core.  There are
    >>> lots of good packages, and their maintainers should
    >>> continue to maintain them.
    >>> 
    >>> Duncan Murdoch
    >>> 
    >>> ______________________________________________
    >>> R-devel at r-project.org mailing list
    >>> https://stat.ethz.ch/mailman/listinfo/r-devel
