[R] Thread parallelism and memory management on shared-memory supercomputers
andrewcd at gmail.com
Wed Dec 30 18:36:44 CET 2015
I've got allocations on a couple of shared-memory supercomputers, which
I use to run computationally intensive scripts on multiple cores of the
same node. One machine has 24 cores per node, the other 48.
In both cases there is a hard memory limit that is shared among the
cores on the node. On the latter machine the limit is 255G; if my job
requests more than that, it is aborted.
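For concreteness, a job submission under that limit might look like the
sketch below. This assumes a SLURM scheduler and a hypothetical script
name `my_parallel_script.R`; the directive names differ under other
schedulers such as PBS/Torque.

```shell
#!/bin/bash
#SBATCH --nodes=1            # one shared-memory node
#SBATCH --ntasks=1           # a single R process ...
#SBATCH --cpus-per-task=48   # ... that will use all 48 cores
#SBATCH --mem=255G           # per-node memory cap; exceeding it aborts the job

Rscript my_parallel_script.R
```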
Now, I don't fully understand resource allocation on these sorts of
systems, but I do understand that the kind of "thread parallelism" done
by, e.g., the `parallel` package in R isn't identical to the
parallelism commonly used in lower-level languages. For example, when I
request a node, I ask for only one of its cores. My R script then
detects the number of cores on the node and farms tasks out to them via
the `foreach` package. My understanding is that in lower-level
languages the number of cores must be specified in the shell script,
and a particular job script is handed directly to each worker.
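The detect-and-farm-out setup described above looks roughly like the
following minimal sketch; `myTask` here is a stand-in for the real
computationally intensive function.

```r
library(parallel)    # detectCores(), makeCluster()
library(doParallel)  # registerDoParallel()
library(foreach)     # foreach() %dopar%

# Stand-in for the real computationally intensive task
myTask <- function(i) i^2

n_cores <- detectCores()      # all cores visible on the node
cl <- makeCluster(n_cores)    # one worker process per core
registerDoParallel(cl)

# Farm tasks out to the workers
results <- foreach(i = 1:8, .combine = c) %dopar% myTask(i)

stopCluster(cl)
```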
My problem is that my parallel-calling R script is being killed by the
cluster: the sum of the memory requested across the worker threads
exceeds my allocation, so the scheduler terminates the job. I don't see
this problem when running on my laptop's 4 cores, presumably because my
laptop has a higher memory-per-core ratio.
My question: how can I ensure that the total memory requested by
N workers stays below a given threshold? Is this even possible? If
not, can I benchmark a process locally, record the maximum per-worker
memory used, and use that to back out the number of workers I can run
within a given node's memory limit?
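The benchmark-then-back-out idea might be sketched as follows. The
caveats: `myTask` is again a hypothetical stand-in, and `gc()` tracks
only the R heap (column 6 of its matrix is the "max used" total in Mb),
so memory allocated outside R, e.g. by compiled libraries, is not
counted; a safety margin is advisable.

```r
library(parallel)  # detectCores()

# Stand-in task; replace with one representative unit of real work
myTask <- function(i) { x <- rnorm(1e6); sum(x) }

# Reset R's peak-memory counters, run one task, then read back the
# peak heap use ("max used" Mb for Ncells + Vcells).
invisible(gc(reset = TRUE))
res <- myTask(1)
peak_mb <- sum(gc()[, 6])

node_limit_mb <- 255 * 1024   # the 255G hard limit, in Mb
margin <- 0.8                 # leave 20% headroom for untracked memory

n_workers <- min(detectCores(),
                 floor(margin * node_limit_mb / peak_mb))
```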
Thanks in advance!