[R] problem with the use of parallel foreach

Vivek Sutradhara viveksutra at gmail.com
Tue Dec 22 13:41:01 CET 2015


Hi,
I am having a problem with the foreach package. Strangely, my code works
when I use the %do% operator but not with %dopar%.

Let me explain. I am new to the parallel and foreach packages. My data are
spread over several very large files, each holding a data.table. I have
saved them as rds files to take advantage of the compression.

I will try to make a reproducible example as follows:

setwd('C:/Rtrials/parallelTrials')
library(parallel)
#no_ofCores<-detectCores()
library(doSNOW)
cl <- makeCluster(2, type="SOCK")
registerDoSNOW(cl)

library(data.table)
# example data
mt<-data.table(mtcars)
# split mtcars into 4 data.tables and save into 4 rds files (to mimic my
# file structure)
nlow<-1
for (i in 1:4) {
  filei<-paste0('mt',i,'.rds')
  nhigh<-i*8
  mti<-mt[nlow:nhigh]
  saveRDS(mti,file=filei)
  nlow<-i*8+1
}

# read the files in parallel and aggregate
mt6<-foreach(j=1:4,.combine='rbind') %dopar% {           # works with %do%
  filej<-paste0('mt',j,'.rds')
  mtj<-readRDS(filej)
  mtj[cyl == 6]
}
stopCluster(cl)

I get the following error message when using %dopar% :

Error in { : task 1 failed - "object 'cyl' not found"

When I change the %dopar% operator to %do%, I do not get an error
message. What is wrong with my use of %dopar%?

I would appreciate help in troubleshooting.
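
One thing I have not ruled out is that the worker processes do not have
data.table loaded, so that mtj[cyl == 6] is evaluated there as a plain
data.frame subset. Below is a sketch of what I could try, assuming the
.packages argument of foreach loads the named package on each worker. The
temporary directory and the files vector are only for illustration and are
not part of my real setup.

```r
library(doSNOW)      # also attaches foreach and snow
library(data.table)

cl <- makeCluster(2, type = "SOCK")
registerDoSNOW(cl)

# illustration only: write mtcars in 4 chunks of 8 rows to a temp directory
mt <- data.table(mtcars)
files <- file.path(tempdir(), paste0("mt", 1:4, ".rds"))
for (i in 1:4) {
  saveRDS(mt[((i - 1) * 8 + 1):(i * 8)], file = files[i])
}

# .packages asks foreach to load data.table on every worker (assumed fix)
mt6 <- foreach(j = 1:4, .combine = 'rbind',
               .packages = 'data.table') %dopar% {
  mtj <- readRDS(files[j])
  mtj[cyl == 6]
}

stopCluster(cl)
```

The absolute paths in files are used so that the workers do not depend on
the working directory of the master session.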

Instead of the foreach loop, I tried the same with a for loop. After
appending each file's rows to the aggregated result, I had to delete the
table just read, run garbage collection, and then read in the next
file. Something like the following:

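dtAll<-mt[0]                       # empty table with the same columns
for (j in 1:25) {                  # the example above creates only 4 files
  filetxt<-paste0('mt',j,'.rds')
  dtj<-readRDS(filetxt)
  dtAll<-rbindlist(list(dtAll,dtj))
  rm(dtj);gc()                     # drop the table just read, free memory
}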

How is garbage collection handled in parallel computing? With the .combine
= 'rbind' option, this may not be necessary. Could somebody comment on
this? Would it be better to use the 'rbindlist' option instead of 'rbind'?
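
To make the rbindlist question concrete, here is a sketch of the serial
version I have in mind: read every file into a list and bind the list once
with rbindlist(), instead of growing the table inside the loop. The
temporary directory and the names files and dtList are made up for this
illustration.

```r
library(data.table)

# illustration only: write mtcars in 4 chunks of 8 rows to a temp directory
mt <- data.table(mtcars)
files <- file.path(tempdir(), paste0("mt", 1:4, ".rds"))
for (i in 1:4) {
  saveRDS(mt[((i - 1) * 8 + 1):(i * 8)], file = files[i])
}

# read each file into one list element, then bind all tables in a single call
dtList <- lapply(files, readRDS)
dtAll <- rbindlist(dtList)
```

This keeps all tables in memory at the same time, so I am not sure it is
suitable for my very large files.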

First, I would like to know what my problem with %dopar% is.

Thanks for any help that I can get.


Vivek
