[Rd] Bug in parallel / mclapply

Bernd Bischl bernd_bischl at gmx.net
Fri Nov 16 23:48:13 CET 2012


Hi,

there seem to be some (small) bugs in the mclapply function in parallel.
I discovered this in the current R release version, and I checked that it is 
still present in R-devel.

I think it only occurs in the part of the code corresponding to argument option
mc.preschedule = FALSE.

Here are two examples:

a)
library(parallel)
mclapply(list(), identity, mc.preschedule=FALSE)
Error in sum(sapply(res, inherits, "try-error")) :
   invalid 'type' (list) of argument

Possible reason / fixes:

The relevant portion of the code is this:

if (!mc.preschedule) {
   FUN <- match.fun(FUN)
   if (length(X) <= cores) {
       jobs <- lapply(seq_along(X), function(i) mcparallel(FUN(X[[i]],
           ...), name = names(X)[i], mc.set.seed = mc.set.seed,
           silent = mc.silent))
       res <- mccollect(jobs)
       if (length(res) == length(X))
           names(res) <- names(X)
       has.errors <- sum(sapply(res, inherits, "try-error"))
   }

If there are 0 jobs, mccollect returns NULL, sapply on this returns the empty 
list, leads to an error in sum.
Possible fix might be to use vapply with FUN.VALUE instead.


b)

library(parallel)
foo <- function(i) NULL
mclapply(1, foo, mc.preschedule=FALSE)

Error in sum(sapply(res, inherits, "try-error")) :
   invalid 'type' (list) of argument

mclapply(1:2, foo, mc.preschedule=FALSE)
$`17222`
NULL

# should be list(NULL, NULL)


Possible reason / fixes:

The bug here is more complicated I think, and occurs in mccollect.

By using debug, I ended up in this line in mccollect:

res[[which(pid == pids)]] <- unserialize(r)

Here, the result of foo (=NULL) is not entered in the result list "res", because 
"<- NULL" is sementically interpreted as "delete list element".
(which has btw trapped me quite often, too...)
Therefore mccollect does not return a list of NULLs as it should, but only 
list(), which leads to the same error as in a).
In the case mclapply(1:2, foo, mc.preschedule=FALSE) something even more 
complicated happens, as on.exit we jump into a "cleanup" function which again 
calls mccollect. Anyway, the final result is wrong, too.

Possible fix:
Actually I am not sure what the best way here is, except for an extra if-case to 
avoid this when the assigment happens?

Best,

Bernd



More information about the R-devel mailing list