[R] boosting - second posting

Kuhn, Max Max.Kuhn at pfizer.com
Tue May 30 15:17:58 CEST 2006


The distribution argument appears to be the problem: either "bernoulli"
or "adaboost" is appropriate for classification problems.
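
A minimal sketch of a corrected call (untested; it assumes simNuance is
recoded as a numeric 0/1 vector, since the bernoulli distribution does
not accept a factor response), with the other settings kept from your
post:

library(gbm)

## untested sketch -- assumes train$simNuance is numeric 0/1, not a factor
boost.model <- gbm(simNuance ~ .,
         data = train,
         distribution = "bernoulli",   # or "adaboost"; not "gaussian"
         n.trees = 3000,
         shrinkage = 0.005,
         interaction.depth = 3,
         bag.fraction = 0.5,
         train.fraction = 0.5,
         n.minobsinnode = 10,
         cv.folds = 5,
         keep.data = TRUE,
         verbose = FALSE)

best.iter <- gbm.perf(boost.model, method = "cv")

## with a bernoulli fit, type = "response" returns probabilities in [0, 1]
pred <- predict.gbm(boost.model, test, n.trees = best.iter,
                    type = "response")
summary(pred)

The predictions you showed (roughly 0.5 to 1.9) look like a gaussian
regression on the factor's underlying 1/2 codes, which is why they are
not on the 0-1 scale.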

Max

> Perhaps by following the Posting Guide you're likely to get more helpful
> responses.  You have not shown an example that others can reproduce, nor
> given version information for R or gbm.  The output you showed does not
> use type="response", either.
>  
> Andy
> 
>   _____  
> 
> From: r-help-bounces at stat.math.ethz.ch on behalf of stephenc
> Sent: Sat 5/27/2006 4:02 PM
> To: 'R Help'
> Subject: [R] boosting - second posting [Broadcast]
> 
> 
> 
> Hi 
>   
> I am using boosting for a classification and prediction problem. 
>   
> For some reason it is giving me predictions that don't fall between 0
> and 1.  I have tried type="response", but it made no difference.
>   
> Can anyone see what I am doing wrong? 
>   
> Screen output shown below: 
>   
>   
> > boost.model <- gbm(as.factor(train$simNuance) ~ .,  # formula
> +          data=train,                  # dataset
> +                                       # +1: monotone increase,
> +                                       #  0: no monotone restrictions
> +          distribution="gaussian",     # bernoulli, adaboost, gaussian,
> +                                       # poisson, and coxph available
> +          n.trees=3000,                # number of trees
> +          shrinkage=0.005,             # shrinkage or learning rate,
> +                                       # 0.001 to 0.1 usually work
> +          interaction.depth=3,         # 1: additive model, 2: two-way interactions, etc.
> +          bag.fraction = 0.5,          # subsampling fraction, 0.5 is probably best
> +          train.fraction = 0.5,        # fraction of data for training,
> +                                       # first train.fraction*N used for training
> +          n.minobsinnode = 10,         # minimum total weight needed in each node
> +          cv.folds = 5,                # do 5-fold cross-validation
> +          keep.data=TRUE,              # keep a copy of the dataset with the object
> +          verbose=FALSE)               # print out progress
> > 
> > best.iter = gbm.perf(boost.model,method="cv") 
> > pred = predict.gbm(boost.model, test, best.iter) 
> > summary(pred) 
>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
> 0.4772  1.5140  1.6760  1.5100  1.7190  1.9420


