[R] Problem parallelizing across cores

James Spottiswoode
Wed Aug 28 20:06:57 CEST 2019


Hi All,

I have a piece of well-optimized R code for text analysis running under
Linux on an AWS instance.  The code first loads a number of packages and
some required data; the actual analysis is then done by a function called,
say, f(string).  I would like to parallelize calls to this function across
the 8 cores of the instance to increase throughput.  I have looked at the
doParallel and future packages but am not clear on how to do this.  Any
method that starts a new R instance each time the function is called will
not work for me, because the time to load the packages and data is
comparable to the execution time of the function, so there would be no
speedup.  I therefore need to keep a number of R processes running
continuously, so that the loading happens only once, when the processes
are first started, and thereafter f(string) is ready to run in each of
them.  I hope I have put this clearly.
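
To make the requirement concrete, here is a rough sketch of the
persistent-worker pattern described above, using the base parallel package
rather than doParallel or future.  It is only an illustration under stated
assumptions, not a tested solution: the worker count, the dummy data object
`lookup`, the placeholder body of f, and the example input strings are all
hypothetical stand-ins for the real setup.

```r
library(parallel)

# Assumption: 2 workers for illustration; on the 8-core instance this would be 8.
n_workers <- 2
cl <- makeCluster(n_workers)

# One-time setup, evaluated in the global environment of every worker.
# The real library() calls and data loading would go here.
clusterEvalQ(cl, {
  lookup <- c(a = 1, b = 2)             # hypothetical stand-in for the loaded data
  f <- function(string) nchar(string)   # hypothetical stand-in for the real f
  NULL                                  # avoid shipping large objects back to the master
})

# The workers stay alive between calls, so repeated batches reuse the
# packages and data that were loaded above.
strings <- c("alpha", "beta", "gamma", "delta")
results <- parLapply(cl, strings, function(s) f(s))

# Shut the workers down only when no more work is expected.
stopCluster(cl)
```

In a long-running process the stopCluster() call would be deferred until
shutdown, so that every later batch of strings goes to workers that are
already initialized.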

I’d much appreciate any suggestions.  Thanks in advance,

James Spottiswoode

