[R] R parallel - slow speed

Martin Spindler Martin.Spindler at gmx.de
Thu Jul 30 14:26:34 CEST 2015


Dear all,

I am trying to parallelize the function npnewpar given below. When I compare an application of "apply" with "parApply", the parallelized version turns out to be much slower (cf. output below). I would therefore like to ask how the function could be parallelized more efficiently. (With increasing sample size the difference becomes smaller, but I was wondering about this large difference and how it could be improved.)

Thank you very much in advance for your help!

Best,

Martin


library(microbenchmark)
library(doParallel)

n <- 500                                          # sample size
y <- rnorm(n)                                     # response
Xc <- rnorm(n)                                    # continuous regressor
Xd <- sample(c(0, 1), n, replace = TRUE)          # discrete regressor, one label per observation
Weights <- diag(n)                                # weight matrix for the discrete part
n1 <- 50                                          # number of evaluation points
Xeval <- cbind(rnorm(n1), sample(c(0, 1), n1, replace = TRUE))


detectCores()
cl <- makeCluster(4)
registerDoParallel(cl)   # only needed for foreach(); parApply() uses cl directly
microbenchmark(
  apply(Xeval, 1, npnewpar, y = y, Xc = Xc, Xd = Xd, Weights = Weights, h = 0.5),
  parApply(cl, Xeval, 1, npnewpar, y = y, Xc = Xc, Xd = Xd, Weights = Weights, h = 0.5),
  times = 100
)
stopCluster(cl)


Unit: milliseconds
           expr        min         lq       mean     median         uq        max neval
     apply(...)   4.674914   4.726463   5.455323   4.771016   4.843324   57.01519   100
  parApply(...)  34.168250  35.434829  56.553296  39.438899  49.777265  347.77887   100
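
The slowdown is most likely communication, not computation. With a socket cluster, parApply serializes its arguments, including the 500 x 500 Weights matrix (about 2 MB), and ships a copy to every worker on every call, while the 50 evaluations themselves take only about 5 ms in total in the serial run. One way to cut this cost is to export the large, constant objects to the workers once and pass only what changes between calls. A minimal sketch of that idea follows; the wrapper name npnewpar_fixed is made up here, and it assumes npnewpar and the data objects live in the global environment so that clusterExport can find them:

library(parallel)

cl <- makeCluster(4)
## copy the large, constant objects to each worker once
clusterExport(cl, c("npnewpar", "y", "Xc", "Xd", "Weights"))

## hypothetical wrapper: only the evaluation point and the bandwidth
## travel to the workers; everything else is already there
npnewpar_fixed <- function(xeval, h) npnewpar(y, Xc, Xd, Weights, h, xeval)

res <- parApply(cl, Xeval, 1, npnewpar_fixed, h = 0.5)
stopCluster(cl)

Even with the data exported up front, each task here is so cheap that the fixed latency of the socket transport can still dominate; parallelization starts to pay off once the per-task work is substantially larger, which matches the observation that the gap shrinks as the sample size grows.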

npnewpar <- function(y, Xc, Xd, Weights, h, xeval) {
  ## Kernel regression estimate at a single evaluation point xeval:
  ## Gaussian kernel for the continuous regressor Xc, table look-up in
  ## Weights for the discrete regressor Xd.
  xc <- xeval[1]                # continuous coordinate of the evaluation point
  xd <- xeval[2]                # discrete coordinate of the evaluation point
  l <- function(x, X) {
    ## weight of label x against the label vector X; assumes the labels
    ## are valid row/column indices into Weights
    Weights[x, X]
  }
  u <- (Xc - xc) / h
  K <- dnorm(u)                 # Gaussian kernel (a generic kernel(u) would also fit here)
  L <- l(xd, Xd)
  num <- sum(y * K * L)         # numerator: weighted sum of responses
  denom <- sum(K * L)           # denominator: sum of weights
  return(num / denom)
}
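
For a problem of this size it may be even better to drop the loop over evaluation points entirely: the estimator is a ratio of two weighted sums, so all n1 points can be computed at once with outer(). The following is only a sketch, with the made-up name npnewpar_vec, and it assumes the labels in Xd and Xeval[, 2] are valid (1-based) row/column indices into Weights; with the 0/1 labels used in the example above, the indexing would need to be adapted first.

npnewpar_vec <- function(y, Xc, Xd, Weights, h, Xeval) {
  ## n1 x n kernel weights: K[j, i] = dnorm((Xeval[j, 1] - Xc[i]) / h)
  K <- dnorm(outer(Xeval[, 1], Xc, "-") / h)
  ## n1 x n discrete weights: L[j, i] = Weights[Xeval[j, 2], Xd[i]]
  L <- Weights[Xeval[, 2], Xd, drop = FALSE]
  W <- K * L
  ## one fitted value per evaluation point
  as.vector(W %*% y) / rowSums(W)
}

ghat <- npnewpar_vec(y, Xc, Xd, Weights, h = 0.5, Xeval)

This builds two 50 x 500 matrices, which is cheap here; for much larger evaluation grids one would process the rows of Xeval in chunks instead.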


