[R] Parallelizing GBM

Lorenzo Isella lorenzo.isella at gmail.com
Sun Mar 24 14:28:24 CET 2013


Thanks a lot for the quick answer.
However, from what I can see, the parallelization affects only the
cross-validation part of the gbm interface; it changes nothing when
you call gbm.fit directly.
Am I missing anything here?
Is there any fundamental reason why gbm.fit cannot be parallelized?

Lorenzo
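
For reference, here is a minimal sketch of the fold-level parallelism
discussed above. Cross-validation folds are independent of each other,
so each fold can be sent to its own core. The iterations inside a single
boosted fit each build on the previous ones, which is presumably why a
lone gbm.fit call does not benefit in the same way. Everything below is
an assumption for illustration: the data are simulated, fit_and_score is
a hypothetical helper, and lm is only a stand-in so that the example
runs with base R alone. In practice the marked line is where a call such
as gbm.fit would go.

```r
library(parallel)

set.seed(1)
n <- 200
x <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
y <- 2 * x$x1 - x$x2 + rnorm(n, sd = 0.1)

# Assign each row to one of k folds.
k <- 4
fold_id <- sample(rep(seq_len(k), length.out = n))

# Hypothetical helper: fit on all folds but one, score on the held-out fold.
fit_and_score <- function(fold) {
  train <- fold_id != fold
  dat <- cbind(x[train, , drop = FALSE], y = y[train])
  # Stand-in model; replace with e.g. gbm.fit(x[train, ], y[train], ...)
  model <- lm(y ~ ., data = dat)
  pred <- predict(model, newdata = x[!train, , drop = FALSE])
  mean((y[!train] - pred)^2)  # held-out mean squared error
}

# mclapply relies on forking, which is unavailable on Windows,
# so fall back to a single core there.
cores <- if (.Platform$OS.type == "windows") 1L else min(k, 4L)

# One fold per worker; results come back in fold order.
cv_mse <- unlist(mclapply(seq_len(k), fit_and_score, mc.cores = cores))
print(cv_mse)
```

The same pattern applies to any set of independent fits, for example
one fit per tuning-parameter setting rather than one per fold.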



On Sun, 24 Mar 2013 12:45:39 +0100, Max Kuhn <mxkuhn at gmail.com> wrote:

> See this:
>
>   https://code.google.com/p/gradientboostedmodels/issues/detail?id=3
>
>
> and this:
>
>   https://code.google.com/p/gradientboostedmodels/source/browse/?name=parallel
>
>
>
> Max
>
>
> On Sun, Mar 24, 2013 at 7:31 AM, Lorenzo Isella  
> <lorenzo.isella at gmail.com> wrote:
>
>> Dear All,
>>
>> I am far from being a guru at parallel programming.
>>
>> Most of the time, I rely on randomForest for data mining large datasets.
>>
>> I would also like to try the gradient boosted methods in gbm,
>> but I need parallelization.
>>
>> I normally rely on gbm.fit for speed reasons, and I usually call it
>> this way:
>>
>> gbm_model <- gbm.fit(trainRF, prices_train,
>>                      offset = NULL,
>>                      misc = NULL,
>>                      distribution = "multinomial",
>>                      w = NULL,
>>                      var.monotone = NULL,
>>                      n.trees = 50,
>>                      interaction.depth = 5,
>>                      n.minobsinnode = 10,
>>                      shrinkage = 0.001,
>>                      bag.fraction = 0.5,
>>                      nTrain = (n_train/2),
>>                      keep.data = FALSE,
>>                      verbose = TRUE,
>>                      var.names = NULL,
>>                      response.name = NULL)
>>
>> Does anybody know an easy way to parallelize the model (in this case,
>> that simply means having 4 cores on the same machine working on the
>> problem)?
>>
>> Any suggestion is welcome.
>>
>> Cheers
>>
>>
>>
>> Lorenzo
>
>
>
> --
> Max


