[R] CART vs. Random Forest

Thu Sep 26 21:54:06 CEST 2002

We haven't implemented different voting thresholds in the package itself,
but predict can return votes or probabilities instead of classes.  The
type argument to predict.randomForest is "class" by default, but can also
be "vote" or "prob".  You can use the training set to find a good
threshold, check the result on a test set, and then apply that threshold
to new data.
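
Here is a rough sketch of that workflow, not something built into the
package.  It assumes the randomForest package is installed and uses made-up
two-class data.  For the training-set estimate it reads the out-of-bag vote
fractions stored in the fitted object's votes component, which is my own
choice here.  The 90% target, the variable names, and the helper
sensitivity() are only for illustration.

```r
library(randomForest)  # assumed to be installed

set.seed(1)
# Made-up two-class data; class "1" is the rarer class we want to find.
n <- 600
x <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
y <- factor(ifelse(x$x1 + rnorm(n) > 1, "1", "0"), levels = c("0", "1"))
train <- 1:400
test  <- 401:600

rf <- randomForest(x[train, ], y[train], ntree = 200)

# Out-of-bag fraction of trees voting for class "1" on the training cases.
p_train <- rf$votes[, "1"]

# Fraction of the true 1's that are called "1" at a given threshold.
sensitivity <- function(truth, p1, threshold) {
  mean(p1[truth == "1"] >= threshold)
}

# Highest threshold that still finds at least 90% of the training 1's.
grid <- seq(0, 1, by = 0.01)
sens <- sapply(grid, function(t) sensitivity(y[train], p_train, t))
threshold <- max(grid[sens >= 0.90])

# Apply the chosen threshold to the test set and check the result.
p_test <- predict(rf, x[test, ], type = "prob")[, "1"]
pred_test <- factor(ifelse(p_test >= threshold, "1", "0"),
                    levels = c("0", "1"))
print(table(truth = y[test], predicted = pred_test))
```

The table on the test set shows how many 0's get called 1's as the price of
finding most of the 1's.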

I suppose we could implement a threshold that could be supplied to predict,
but then we'd have to work something out for multi-class problems -- several
different cutpoints, I guess.  It's not a priority for Andy or me right now.
I actually like to take a look at the ROC curve anyway, to decide what
tradeoffs are worthwhile.
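
For the ROC curve, something along these lines works in base R.  The truth
and votes vectors below are invented numbers standing in for the true
classes and the fraction of trees voting for class 1, and roc_points() is a
hypothetical helper, not a function from any package.

```r
# Invented example: true classes and fraction of trees voting for class 1.
truth <- c(0, 0, 1, 0, 1, 1, 0, 1, 0, 1)
votes <- c(0.10, 0.30, 0.35, 0.40, 0.55, 0.60, 0.65, 0.80, 0.20, 0.90)

roc_points <- function(truth, score) {
  # Try every observed score as a cutoff, plus Inf so the curve starts
  # at (0, 0).  A case is called class 1 when its score is >= the cutoff.
  thresholds <- c(Inf, sort(unique(score), decreasing = TRUE))
  data.frame(
    threshold = thresholds,
    tpr = sapply(thresholds, function(t) mean(score[truth == 1] >= t)),
    fpr = sapply(thresholds, function(t) mean(score[truth == 0] >= t))
  )
}

roc <- roc_points(truth, votes)
print(roc)

# Each point is one possible threshold; pick the tradeoff you can live with.
plot(roc$fpr, roc$tpr, type = "b",
     xlab = "False positive rate", ylab = "True positive rate")
```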

I'd compare the results by looking at the error rates -- if you can make the
(possibly weighted) error rate lower with one method or the other, that's
the method that wins.
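
To put both methods on the same scale, you can score each set of test-set
predictions against one loss matrix.  A small base R sketch, with invented
predictions standing in for the rpart and random forest output, a
hypothetical helper weighted_error(), and an assumed cost of 5 for missing
a 1 versus 1 for a false alarm:

```r
# Invented test-set truth and predictions from the two methods.
truth      <- factor(c(0, 0, 0, 0, 0, 0, 1, 1, 1, 1), levels = c(0, 1))
pred_rpart <- factor(c(0, 0, 0, 0, 0, 1, 1, 1, 0, 0), levels = c(0, 1))
pred_rf    <- factor(c(0, 0, 0, 1, 1, 1, 1, 1, 1, 0), levels = c(0, 1))

# Loss matrix: rows are true classes, columns are predicted classes.
# Assumed costs: missing a 1 costs 5, calling a 0 a 1 costs 1.
loss <- matrix(c(0, 1,
                 5, 0), nrow = 2, byrow = TRUE,
               dimnames = list(truth = c("0", "1"), pred = c("0", "1")))

weighted_error <- function(truth, pred, loss) {
  # Confusion matrix laid out the same way as the loss matrix.
  confusion <- table(truth, pred)
  sum(confusion * loss) / length(truth)
}

err_rpart <- weighted_error(truth, pred_rpart, loss)  # 1.1
err_rf    <- weighted_error(truth, pred_rf, loss)     # 0.8
print(c(rpart = err_rpart, rf = err_rf))
```

With these made-up numbers the plain error rate favors the first set of
predictions (3 errors against 4), but the weighted rate favors the second,
because it misses fewer 1's.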

Regards,

Matt

-----Original Message-----
From: Andrew Baek [mailto:andrew at stat.ucla.edu]
Sent: Thursday, September 26, 2002 3:33 PM
To: Wiener, Matthew
Cc: r-help at stat.math.ethz.ch
Subject: RE: [R] CART vs. Random Forest

> One suggestion if making sure you find the 1's is more important than having
> a low overall error rate:  in rpart, you can specify a loss matrix to say
> that certain kinds of errors are more important than others.  In a random
> forest, you can use different voting thresholds for "1-ness" and "0-ness" to
> bias things -- that is, instead of just taking majority vote, you might
> require (for example) 85% of the trees to agree for something to be declared
> in class 0.

If I use a loss matrix in "rpart" and a different threshold in "RF", how
can I compare the two packages? Andy Liaw told me that "classwt" in RF does
not help much. But when I modified the priors in rpart, I got totally new
results, so I thought something similar should be applied to RF.

Also, I'd appreciate it if you could tell me how to change the voting
threshold in RF. I couldn't find it in the manual. Thank you.

Andrew
