[R] Parallel R

Luke Tierney luke at stat.uiowa.edu
Thu Jul 10 20:01:08 CEST 2008


pnmath currently uses up to 8 threads (i.e. 1, 2, 4, or 8).
getNumPnmathThreads() should tell you the maximum number used on your
system, which should be 8 if the number of processors is being
identified correctly.  With the size of m this calculation should be
using 8 threads, but the exp calculation is fairly fast, so the
threading overhead is noticeable. On a Linux box with 4 dual-core AMD processors
I get

> m <- matrix(0, 10000, 1000)
> mean(replicate(10, system.time(exp(m), gcFirst=TRUE))["elapsed",])
[1] 0.3859
> library(pnmath)
> mean(replicate(10, system.time(exp(m), gcFirst=TRUE))["elapsed",])
[1] 0.0775
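
(As a check, getNumPnmathThreads() reports the maximum number of threads
pnmath will use; on this 8-core box it should look something like this:)

> getNumPnmathThreads()
[1] 8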

A similar example using qbeta, a slower function, gives

> p <- matrix(0.5,1000,1000)
> setNumPnmathThreads(1)
[1] 1
> mean(replicate(10, system.time(qbeta(p,2,3), gcFirst=TRUE))["elapsed",])
[1] 7.334
> setNumPnmathThreads(8)
[1] 8
> mean(replicate(10, system.time(qbeta(p,2,3), gcFirst=TRUE))["elapsed",])
[1] 0.9576


On an 8-core Intel/OS X box the improvement for exp is much smaller, but
the improvement for qbeta is similar.
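
If you want to repeat these comparisons on your own machine, the timing
pattern above can be wrapped in a small helper.  This is just a sketch
(the name timeParallel is made up for illustration):

library(pnmath)

timeParallel <- function(expr, nthreads, reps = 10) {
    e <- substitute(expr)        # capture the expression unevaluated
    env <- parent.frame()        # evaluate it in the caller's environment
    setNumPnmathThreads(nthreads)
    elapsed <- replicate(reps,
                         system.time(eval(e, env), gcFirst = TRUE)["elapsed"])
    mean(elapsed)
}

## e.g. compare timeParallel(exp(m), 1) with timeParallel(exp(m), 8)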

luke


On Thu, 10 Jul 2008, Martin Morgan wrote:

> "Juan Pablo Romero Méndez" <jpablo.romero at gmail.com> writes:
>
>> Just out of curiosity, what system do you have?
>>
>> These are the results in my machine:
>>
>>> system.time(exp(m), gcFirst=TRUE)
>>    user  system elapsed
>>    0.52    0.04    0.56
>>> library(pnmath)
>>> system.time(exp(m), gcFirst=TRUE)
>>    user  system elapsed
>>   0.660   0.016   0.175
>>
>
> from cat /proc/cpuinfo, the original results were from a 32-bit
> dual-core system
>
> model name   : Intel(R) Core(TM)2 CPU         T7600  @ 2.33GHz
>
> Here's a second set of results on a 64-bit system with 16 cores (4 cores
> on each of 4 physical processors, I think)
>
>> mean(replicate(10, system.time(exp(m), gcFirst=TRUE))["elapsed",])
> [1] 0.165
>> mean(replicate(10, system.time(exp(m), gcFirst=TRUE))["elapsed",])
> [1] 0.0397
>
> model name   : Intel(R) Xeon(R) CPU           X7350  @ 2.93GHz
>
> One thing is that, for me, the faster processor is actually slower in
> single-thread mode. This could be because of 64-bit issues, other
> hardware design aspects, the way I've compiled R on the two platforms,
> or other system activity on the larger machine.
>
> A second thing is that the larger machine appears to accelerate only
> about 4-fold, rather than the naive 16-fold; I think this comes from
> decisions in the pnmath code about how many processors to use,
> although I'm not sure.
>
> A final thing is that running intensive tests on my laptop generates
> enough extra heat to increase the fan speed and laptop temperature. I
> sort of wonder whether consumer laptops/desktops are engineered for
> sustained use of their multiple cores (although I guess the gaming
> community makes heavy use of them).
>
> Martin
>
>
>
>>   Juan Pablo
>>
>>
>>>
>>>> system.time(exp(m), gcFirst=TRUE)
>>>   user  system elapsed
>>>  0.108   0.000   0.106
>>>> library(pnmath)
>>>> system.time(exp(m), gcFirst=TRUE)
>>>   user  system elapsed
>>>  0.096   0.004   0.052
>>>
>>> (elapsed time about 2x faster). Both BLAS and pnmath make much better
>>> use of resources, since they do not require multiple R instances.
>>>
>>
>
>

-- 
Luke Tierney
Chair, Statistics and Actuarial Science
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa                  Phone:             319-335-3386
Department of Statistics and        Fax:               319-335-3017
    Actuarial Science
241 Schaeffer Hall                  email:      luke at stat.uiowa.edu
Iowa City, IA 52242                 WWW:  http://www.stat.uiowa.edu

