[R] No speed up using the parallel package and ncpus > 1 with boot() on linux machines

Milan Bouchet-Valat nalimilan at club.fr
Sat Oct 17 19:13:40 CEST 2015


On Saturday 17 October 2015 at 17:18 +0100, Chris Evans wrote:
> I think I am failing to understand how boot() uses the parallel
> package on linux machines, using R 3.2.2 on three different machines
> with 2, 4 and 8 cores all results in a slow down if I use "multicore"
> and "ncpus".  Here's the code that creates a very simple reproducible
> example:
> 
> bootReps <- 500
> seed <- 12345
> set.seed(seed)
> require(boot)
> dat <- rnorm(500)
> bootMean <- function(dat,ind) {
>   mean(dat[ind])
> }
> start.time <- proc.time()
> bootDat <- boot(dat,bootMean,bootReps)
> boot.ci(bootDat,type="norm")
> stop.time <- proc.time()
> elapsed.time1 <- stop.time - start.time
> require(parallel)
> set.seed(seed)
> start.time <- proc.time()
> bootDat <- boot(dat,bootMean,bootReps,
>                 parallel="multicore",
>                 ncpus=2)
> boot.ci(bootDat,type="norm")
> stop.time <- proc.time()
> elapsed.time2 <- stop.time - start.time
> elapsed.time1
> elapsed.time2
> 
> Running that on my old Dell Latitude E6500 running Debian Squeeze and
> using 32 bit R 3.2.2 gives me:
> 
> > bootReps <- 500
> > seed <- 12345
> > set.seed(seed)
> > require(boot)
> > dat <- rnorm(500)
> > bootMean <- function(dat,ind) {
> +   mean(dat[ind])
> + }
> > start.time <- proc.time()
> > bootDat <- boot(dat,bootMean,bootReps)
> > boot.ci(bootDat,type="norm")
> BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
> Based on 500 bootstrap replicates
> 
> CALL : 
> boot.ci(boot.out = bootDat, type = "norm")
> 
> Intervals : 
> Level      Normal        
> 95%   (-0.0034,  0.1677 )  
> Calculations and Intervals on Original Scale
> > stop.time <- proc.time()
> > elapsed.time1 <- stop.time - start.time
> > require(parallel)
> > set.seed(seed)
> > start.time <- proc.time()
> > bootDat <- boot(dat,bootMean,bootReps,
> +                 parallel="multicore",
> +                 ncpus=2)
> > boot.ci(bootDat,type="norm")
> BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
> Based on 500 bootstrap replicates
> 
> CALL : 
> boot.ci(boot.out = bootDat, type = "norm")
> 
> Intervals : 
> Level      Normal        
> 95%   (-0.0030,  0.1675 )  
> Calculations and Intervals on Original Scale
> > stop.time <- proc.time()
> > elapsed.time2 <- stop.time - start.time
> > elapsed.time1
>    user  system elapsed 
>   0.028   0.000   0.174 
> > elapsed.time2
>    user  system elapsed 
>   4.336   2.572   0.166 
> 
> A very slightly different 95% CI reflecting the way that invoking
> parallel="multicore" changes the seed setting and a huge
> deterioration in execution speed rather than any improvement.
> 
> On a more recent four core Toshiba and using ncpus=4 again on Debian
> Squeeze, 32bit R, I get exactly the same CIs and this timing:
> 
> > elapsed.time1
> user system elapsed 
> 0.032 0.000 0.100 
> > elapsed.time2
> user system elapsed 
> 0.032 0.020 0.049 
> > 
> 
> and on a Mac Mini with eight cores on Squeeze but with 64bit R I get
> the same CIs and this timing:
> 
> > elapsed.time1
>    user  system elapsed 
>   0.012   0.004   0.017 
> > elapsed.time2
>    user  system elapsed 
>   0.032   0.012   0.024 
> 
> I am clearly missing something, or perhaps something else is choking
> the work, not the CPU power, RAM?  I've tried searching for similar
> reports on the web and was surprised to find nothing using what
> seemed plausible search strategies.
> 
> Anyone able to help me?  I'd desperately like to get a marked speed
> up for some simulation work I'm doing on the Mac mini as it's taking
> days to run at the moment.  The computationally intensive bits in the
> models are a bit more complicated than this example (!) but most of
> the workload will be in the bootstrapping.  The function I'm
> bootstrapping for real is a bit more complex than a simple mean but
> still not very complex, though it does involve a stratified
> bootstrap rather than a simple one.  I see very similar marginal
> speed _losses_ invoking more than one core for that work, just as
> with this very simple example.
Parallel execution pays off only when the operation you want to run
takes long enough to outweigh the cost of setting up the workers. Here,
starting the workers takes more time than computing the means. You
should try with a larger number of replicates, or a slower computation.
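
To illustrate the point, here is a minimal sketch (not a benchmark) that
makes the statistic artificially slow, so that the per-replicate cost
dominates the cost of starting the workers. The function name slowMean
and the 0.01 second pause are assumptions made up for this example; the
pause only stands in for real computation and uses no CPU, so the
timings say nothing about how your actual model will scale.

```r
library(boot)  # boot() calls the parallel package itself when asked to

set.seed(12345)
dat <- rnorm(500)

# Hypothetical slow statistic: the pause stands in for real work.
slowMean <- function(dat, ind) {
  Sys.sleep(0.01)
  mean(dat[ind])
}

bootReps <- 200

# Serial run: roughly bootReps * 0.01 seconds of waiting.
serial <- system.time(
  bootSerial <- boot(dat, slowMean, bootReps)
)

# Forked run: the replicates are split across two workers.
# "multicore" relies on forking, so it is not available on Windows.
forked <- system.time(
  bootForked <- boot(dat, slowMean, bootReps,
                     parallel = "multicore", ncpus = 2)
)

serial["elapsed"]
forked["elapsed"]
```

With a statistic this slow, the elapsed time of the forked run should
come out clearly below that of the serial run, whereas with a plain
mean() the startup cost hides any gain.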


Regards

> TIA,
> 
> Chris
> 


