[Rd] R 3.0.1 : parallel collection triggers "long memory not supported yet"

Simon Urbanek simon.urbanek at r-project.org
Fri May 31 18:47:01 CEST 2013


On May 31, 2013, at 12:14 PM, ivo welch wrote:

> Dear R developers:
> 
> ...
> 7: lapply(seq_len(cores), inner.do)
> 8: FUN(1:3[[3]], ...)
> 9: sendMaster(try(lapply(X = S, FUN = FUN, ...), silent = TRUE))
> 
> Selection: .....................Error in sendMaster(try(lapply(X = S, FUN =
> FUN, ...), silent = TRUE)) :
>  long vectors not supported yet: memory.c:3100
> 
> 
> admittedly, my outcome will be a very big list, with 30,000 elements, each
> containing data frames with 14 variables and around 200 to 5000
> observations (say, 64KB on average).  thus, I estimate that the resulting
> list is 20GB.  the specific code that triggers this is
> 
> 
>    exposures.list <- mclapply(1:length(crsp.list.by.permno),
>                          FUN=function(i, NMO=NMO) {
>                              calcbeta.for.one.stock(crsp.list.by.permno[[i]], NMO=NMO)
>                          },
>                          NMO=NMO, mc.cores=3 )
> 
> the release docs to 3.0.0 suggest this error should occur primarily in
> unusual situations.  so, it's not really a bug.  but I thought I would
> point this out.  maybe this is a forgotten small update.
> 

mclapply uses sendMaster() to send the results (serialized into a raw vector) from each worker back to the parent R session. Apparently the serialized result from one of your workers is larger than 2 GB. The multicore part of parallel currently doesn't support long vectors for the transmission, so the result from a single worker cannot exceed 2 GB. I'll put long vector support on my ToDo list. In your case you should be able to work around it by disabling pre-scheduling (mc.preschedule = FALSE), so that each iteration is forked and sent back separately instead of one whole share of the input per core (you may want to do some grouping if you have 30,000 short iterations, though).
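
A minimal sketch of that workaround, for illustration only. The data and calcbeta.for.one.stock below are small hypothetical stand-ins (the real ones are not shown in this thread), and the chunk size of 25 is an arbitrary assumption to tune so that each chunk's serialized result stays well under 2 GB:

```r
library(parallel)

# Hypothetical stand-ins for the poster's objects (assumptions, not the real code).
crsp.list.by.permno <- lapply(1:200, function(i) data.frame(x = rnorm(50), y = rnorm(50)))
calcbeta.for.one.stock <- function(d, NMO) data.frame(n = nrow(d), NMO = NMO)
NMO <- 12

# Group the indices into chunks: fewer forks than one per stock, but each
# chunk's result is still far smaller than one worker's whole share.
chunk.size <- 25   # arbitrary; tune to the size of the results
idx <- seq_along(crsp.list.by.permno)
chunks <- split(idx, ceiling(idx / chunk.size))

# Forking is not available on Windows, where mc.cores must be 1.
cores <- if (.Platform$OS.type == "windows") 1L else 2L

# mc.preschedule = FALSE forks one job per element of 'chunks', and each
# job sends back only its own result.
res.by.chunk <- mclapply(chunks, function(ii) {
    lapply(crsp.list.by.permno[ii], calcbeta.for.one.stock, NMO = NMO)
}, mc.cores = cores, mc.preschedule = FALSE)

# Flatten one level to get one data frame per stock, in the original order.
exposures.list <- unlist(res.by.chunk, recursive = FALSE, use.names = FALSE)
```

With the default mc.preschedule = TRUE and mc.cores = 3, each worker would return roughly a third of the full result in a single transmission, which for a 20 GB list is well past the limit.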

Cheers,
Simon


