[R] No speed up using the parallel package and ncpus > 1 with boot() on linux machines

Jeff Newmiller jdnewmil at dcn.davis.CA.us
Mon Oct 19 09:23:14 CEST 2015


Regarding cores... The only reliable way I have found so far is to look up the processor specs. In your case I found [1] which says 4 cores.

[1] http://ark.intel.com/m/products/64900/Intel-Core-i7-3615QM-Processor-6M-Cache-up-to-3_30-GHz#@product/specifications
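
For a cross-check on the machine itself, here is a minimal sketch, assuming a Linux box whose /proc/cpuinfo exposes "physical id" and "core id" fields (typical on x86, not guaranteed elsewhere). The helper name count_physical_cores() is made up for illustration; it is not part of the parallel package. It counts distinct (socket, core) pairs, so hyperthreaded siblings are counted once.

```r
# Hypothetical helper: count distinct (physical id, core id) pairs in
# /proc/cpuinfo-style text. Returns NA if the fields are missing.
count_physical_cores <- function(lines) {
  field <- function(name) {
    hits <- grep(paste0("^", name, "[[:space:]]*:"), lines, value = TRUE)
    trimws(sub("^[^:]*:", "", hits))
  }
  physical_id <- field("physical id")
  core_id <- field("core id")
  if (length(core_id) == 0L || length(physical_id) != length(core_id)) {
    return(NA_integer_)
  }
  nrow(unique(data.frame(physical_id, core_id)))
}

if (file.exists("/proc/cpuinfo")) {
  cat("physical cores:", count_physical_cores(readLines("/proc/cpuinfo")), "\n")
}
# Logical CPUs as R sees them (includes hyperthreads).
cat("logical CPUs:  ", parallel::detectCores(logical = TRUE), "\n")
```

detectCores(logical = FALSE) exists too, but ?detectCores notes that the argument is not honoured on every platform, so the processor spec sheet remains the safest reference.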
---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<jdnewmil at dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
                                      Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
--------------------------------------------------------------------------- 
Sent from my phone. Please excuse my brevity.

On October 18, 2015 2:31:13 AM PDT, Chris Evans <chrishold at psyctc.org> wrote:
>As with Milan's answer: perfect explanation and hugely appreciated.  A
>few follow up questions/comments below.
>
>----- Original Message -----
>> From: "Jeff Newmiller" <jdnewmil at dcn.davis.ca.us>
>> To: "Chris Evans" <chrishold at psyctc.org>
>> Cc: r-help at r-project.org
>> Sent: Saturday, 17 October, 2015 18:28:12
>> Subject: Re: [R] No speed up using the parallel package and ncpus > 1 with boot() on linux machines
>
>> None of this is surprising. If the calculations you divide your work up
>> into are small, then the overhead of communicating between parallel
>> processes will be a relatively large penalty to pay.  You have to break
>> your problem up into larger chunks and depend on vector processing within
>> processes to keep the cpu busy doing useful work.
>
>Aha.  Got it!
> 
>> Also, I am not aware of any model of Mac Mini that has 8 physical cores...
>> 4 is the max. Virtual cores gain a logical simplification of
>> multiprocessing but do not offer actual improved performance because
>> there are only as many physical data paths and registers as there are
>> cores.
>
>Ah.  Hadn't thought of that.  It's a machine I rent, I thought it was a
>mac mini.  detectCores() reports 8 but perhaps they are virtual cores.
>/proc/cpuinfo says the processor is an Intel(R) Core(TM) i7-3615QM CPU
>@ 2.30GHz and shows 8 cores but again ... perhaps they are virtual. 
>What's the best way to get a true core count?
> 
>> Note that your problems are with long-running simulations... your examples
>> are too small to demonstrate the actual balance of processing vs.
>> communication overhead. Before you draw conclusions, try upping bootReps
>> by a few orders of magnitude, and run your test code a couple
>> of times to stabilize the memory conditions and obtain some consistency
>> in timings.
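
As an illustration of the repeated-timing advice quoted above, a minimal sketch: time_runs() and work() are hypothetical names, and work() is only a stand-in for the real boot() call with a large bootReps.

```r
# Hypothetical helper: run f() several times and collect elapsed seconds.
time_runs <- function(f, n_runs = 5) {
  vapply(seq_len(n_runs),
         function(i) system.time(f())[["elapsed"]],
         numeric(1))
}

# Stand-in workload; replace with the actual bootstrap call being compared.
work <- function() sum(sqrt(seq_len(1e6)))

timings <- time_runs(work, n_runs = 5)
print(timings)          # early runs may differ from later ones
print(median(timings))  # a steadier summary than any single run
```

Running the same harness once with ncpus = 1 and once with ncpus > 1 gives two sets of timings that can be compared on equal footing.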
>
>OK.  Good advice again, but what you are saying, and the findings I had
>there, are pretty consistent with what I was seeing with long-running
>things with bootReps up at 10k, and I think you've told me what I really
>want to know.  I think the simplest way to parallelise may actually be
>fine for me: I'll run four (or maybe eight) separate R jobs (keeping an
>eye on swapping to make sure I'm not pushing beyond physical RAM; I
>don't think these simulations will).
>
>> I have never used the parallel option in the boot package before... I have
>> always rolled my own to allow me to decide how much work to do within the
>> worker processes before returning from them. (The communication overhead
>> is particularly severe when using snow, but not necessarily something you
>> can neglect with multicore.)
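
One way the "roll your own" idea quoted above might look, as a minimal sketch rather than the code actually used: each worker receives one large chunk of replicates and does them in a single vectorised call, so there is one round of communication per worker instead of one per replicate. The functions boot_mean_chunk() and chunked_boot() and the choice of the mean as the statistic are assumptions for illustration. mclapply() relies on forking, so this applies to Linux and macOS, not Windows.

```r
library(parallel)

# Hypothetical worker: n_reps bootstrap means of x, computed in one
# vectorised step (one resample per matrix column).
boot_mean_chunk <- function(n_reps, x) {
  resampled <- matrix(sample(x, length(x) * n_reps, replace = TRUE),
                      nrow = length(x))
  colMeans(resampled)
}

# Hypothetical driver: split boot_reps as evenly as possible across workers.
chunked_boot <- function(x, boot_reps, n_workers) {
  chunk_sizes <- rep(boot_reps %/% n_workers, n_workers)
  extra <- boot_reps %% n_workers
  chunk_sizes[seq_len(extra)] <- chunk_sizes[seq_len(extra)] + 1
  chunk_sizes <- chunk_sizes[chunk_sizes > 0]
  res <- mclapply(chunk_sizes, boot_mean_chunk, x = x, mc.cores = n_workers)
  unlist(res)
}

RNGkind("L'Ecuyer-CMRG")  # reproducible, separate streams in forked workers
set.seed(1)
x <- rnorm(200)
est <- chunked_boot(x, boot_reps = 10000, n_workers = 2)
print(length(est))
print(sd(est))  # bootstrap standard error of the mean
```

Setting n_workers to the physical core count, rather than the logical count, is in keeping with the point about virtual cores made earlier in the thread.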
>
>That sounds like an impressive and obviously pertinent approach.  I
>think, as I say, I may be able to get away with a very simple approach
>that runs parallel simulations and then aggregates the data from each
>and analyses that.
>
>Many thanks Jeff.  Brilliant help.
>
>Chris
>
> 
>> On Sat, 17 Oct 2015, Chris Evans wrote:
>> 
>>> I think I am failing to understand how boot() uses the parallel package on linux
>
>... rest of my original post deleted to save space ...