[R] untaring files in parallel with foreach and doSNOW?
aortizbobea at arec.umd.edu
Tue Jul 24 18:06:19 CEST 2012
I'm running some code that requires untarring many files in the first step.
This takes a lot of time and I'd like to do it in parallel, if possible.
If disk read speed is the bottleneck I should not expect an improvement,
but perhaps the processor is the limiting factor, so I want to try this.
I'm working on Windows 7 with R 2.15.1 and the latest foreach and doSNOW
packages. See sessionInfo() below. Thanks in advance for any input!
# With lapply it works (i.e. each .tar.gz file is decompressed into several
# directories with the files of interest inside)
# It also works with foreach in serial mode:
foreach(i=1:length(tar.files.vector)) %do% untar(tar.files.vector[i])
# However, foreach in parallel mode gives an error:
foreach(i=1:length(tar.files.vector)) %dopar% untar(tar.files.vector[i])
Error in untar(tar.files.vector[i]) :
task 1 failed - "cannot open the connection"
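One possibility, and this is only a guess rather than a confirmed diagnosis: the snow workers are separate R processes, so relative file names in tar.files.vector may not resolve on them if their working directory differs from the master's. Below is a minimal, self-contained sketch of that idea. It assumes a local 2-worker SOCK cluster, and the names tar.dir and out.dir and the two tiny test archives are hypothetical, created only so the example runs on its own. It passes absolute paths and an explicit exdir to untar():

```r
library(foreach)
library(doSNOW)  # also attaches snow

# Hypothetical setup: build two small .tar.gz archives in a temp directory
tar.dir <- file.path(tempdir(), "tars")
out.dir <- file.path(tempdir(), "out")
dir.create(tar.dir, showWarnings = FALSE)
dir.create(out.dir, showWarnings = FALSE)
old.wd <- setwd(tar.dir)
for (n in c("a", "b")) {
  writeLines(n, paste0(n, ".txt"))
  tar(paste0(n, ".tar.gz"), files = paste0(n, ".txt"), compression = "gzip")
}
setwd(old.wd)

# Absolute paths, so they mean the same thing in every worker process
tar.files.vector <- normalizePath(
  list.files(tar.dir, pattern = "\\.tar\\.gz$", full.names = TRUE)
)

# Assumption: a local socket cluster with 2 workers
cl <- makeCluster(2, type = "SOCK")
registerDoSNOW(cl)

# Iterate over the file names directly and extract into an explicit directory
res <- foreach(f = tar.files.vector, .combine = c) %dopar% {
  utils::untar(f, exdir = out.dir)
}

stopCluster(cl)
```

If this runs but the original call still fails, the working directory is probably not the cause and something else (for example, how the workers locate a tar program) would need checking.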
Any ideas on how to address this problem (with these packages or others)?
Thanks in advance.
R version 2.15.1 (2012-06-22)
Platform: x86_64-pc-mingw32/x64 (64-bit)
 LC_COLLATE=English_United States.1252
 LC_CTYPE=English_United States.1252
 LC_MONETARY=English_United States.1252
 LC_TIME=English_United States.1252
attached base packages:
 stats graphics grDevices utils datasets methods base
other attached packages:
 doSNOW_1.0.6 snow_0.3-10 iterators_1.0.6 foreach_1.4.0
 raster_2.0-08 rgdal_0.7-12 sp_0.9-99
loaded via a namespace (and not attached):
 codetools_0.2-8 compiler_2.15.1 grid_2.15.1 lattice_0.20-6
View this message in context: http://r.789695.n4.nabble.com/untaring-files-in-parallel-with-foreach-and-doSNOW-tp4637614.html
Sent from the R help mailing list archive at Nabble.com.