[R] Problem with caret + foreach + M5 combination

Antoine Stevens Antoine.Stevens at uclouvain.be
Wed Sep 21 11:53:15 CEST 2011


Hello,

I often use the caret package to develop regression models and compare
their performance. The foreach package is integrated into caret and can be
used to speed up the process through parallel computation.
Since caret version 5.01-001, one just needs to register the cores with one
of the "do" packages (doMC, etc.). This works fine in most
situations, but there is a problem when I use the M5 algorithm.
Everything works well with only one core, but the computation seems to get
stuck with 2 or more registered cores.
I am using RStudio with R 2.13.1 on a Red Hat Linux 64-bit machine.
Here is an example:

library(doMC);library(randomForest);library(RWeka); library(caret)
library(mlbench)
data(BostonHousing)
registerDoMC()
options(cores=1)
withoutMC <-  train(medv ~ ., data = BostonHousing, "rf") # Works with random forest
options(cores=2)
usingMC <-  train(medv ~ ., data = BostonHousing, "rf") # Works with random forest
options(cores=1)
withoutMC <-  train(medv ~ ., data = BostonHousing, "M5")#Works with M5
options(cores=2)
usingMC <-  train(medv ~ ., data = BostonHousing, "M5")#Does not work

So I tried with another parallel backend (doSNOW), but got another error.

library(doSNOW)
cl <- makeCluster(2, type = "SOCK")
clusterEvalQ(cl,library(caret))
clusterEvalQ(cl,library(RWeka))

[[1]]
 [1] "RWeka"     "caret"     "foreach"   "codetools" "iterators" "cluster"   "reshape"   "plyr"
 [9] "lattice"   "snow"      "methods"   "stats"     "graphics"  "grDevices" "utils"     "datasets"
[17] "base"

[[2]]
 [1] "RWeka"     "caret"     "foreach"   "codetools" "iterators" "cluster"   "reshape"   "plyr"
 [9] "lattice"   "snow"      "methods"   "stats"     "graphics"  "grDevices" "utils"     "datasets"
[17] "base"

registerDoSNOW(cl)
usingSNOW <-  train(medv ~ ., data = BostonHousing, "M5")# Does not work

Error in { :
  task 1 failed - "could not find function "predictionFunction""

Here, the workers cannot find "predictionFunction" (part of the caret
package, I believe), even though I loaded the package on each cluster
worker with clusterEvalQ.
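
One check that might help narrow this down, as a sketch only: ask each
worker whether caret is attached and whether the name "predictionFunction"
can be resolved there. The variable names `attached` and `visible` are
illustrative, and the check assumes the function would be looked up from
the worker's global environment.

```r
library(snow)

# Start two socket workers, as in the doSNOW example above
cl <- makeCluster(2, type = "SOCK")

# Attach caret on every worker (the returned search paths are discarded)
invisible(clusterEvalQ(cl, library(caret)))

# Is caret on each worker's search path?
attached <- clusterEvalQ(cl, "package:caret" %in% search())

# Can each worker resolve the name from its global environment?
visible <- clusterEvalQ(cl, exists("predictionFunction"))

print(unlist(attached))
print(unlist(visible))

stopCluster(cl)
```

If `attached` is TRUE but `visible` is FALSE on a worker, the function may
be internal to caret rather than exported, which would point at how the
name is looked up on the workers rather than at the package load itself.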

Any suggestions?

Here are my sessionInfo() and package versions:

sessionInfo()
R version 2.13.1 (2011-07-08)
Platform: x86_64-redhat-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8
 [4] LC_COLLATE=en_US.UTF-8     LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C
[10] LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
 [1] doSNOW_1.0.5       snow_0.3-7         caret_5.01-001     cluster_1.14.0     reshape_0.8.4
 [6] plyr_1.6           lattice_0.19-30    mlbench_2.1-0      RWeka_0.4-8        randomForest_4.6-2
[11] doMC_1.2.3         multicore_0.1-7    foreach_1.3.2      codetools_0.2-8    iterators_1.0.5

loaded via a namespace (and not attached):
[1] compiler_2.13.1   grid_2.13.1       rJava_0.9-1       RWekajars_3.7.4-1 tools_2.13.1

packageDescription("caret")

Package: caret
Version: 5.01-001
Date: 2011-09-01
Title: Classification and Regression Training
Author: Max Kuhn. Contributions from Jed Wing, Steve Weston, Andre Williams,
           Chris Keefer and Allan Engelhardt
Description: Misc functions for training and plotting classification and
           regression models
Maintainer: Max Kuhn <Max.Kuhn at pfizer.com>
Depends: R (>= 2.10), lattice, reshape, stats, plyr, cluster, foreach
URL: http://caret.r-forge.r-project.org/
Suggests: gbm, pls, mlbench, rpart, ellipse, ipred, klaR, randomForest, gpls,
           pamr, kernlab, mda, mgcv, nnet, class, MASS, mboost,
           earth (>= 2.2-3), party (>= 0.9-99992), ada, affy, proxy, e1071,
           grid, elasticnet, SDDA, caTools, RWeka (>= 0.4-1), superpc,
           penalized, sparseLDA (>= 0.1-1), spls, sda, glmnet, relaxo, lars,
           vbmp, nodeHarvest, rrcov, gam, stepPlr, GAMens (>= 1.1.1), rocc,
           foba, partDSA, hda, fastICA, neuralnet, quantregForest, rda,
           HDclassif, LogicReg, LogicForest, logicFS, RANN, qrnn, Boruta,
           Hmisc, Cubist, bst, leaps
License: GPL-2
Packaged: 2011-09-02 14:09:50 UTC; kuhna03
Repository: CRAN
Date/Publication: 2011-09-02 18:25:30
Built: R 2.13.1; x86_64-redhat-linux-gnu; 2011-09-16 18:51:38 UTC; unix

Thank you very much,

Antoine Stevens
Earth and Life Institute
UCLouvain
Belgium


