[R] Random Forest AUC

mxkuhn mxkuhn at gmail.com
Sat Oct 23 15:39:06 CEST 2010


I think the issue is that you really can't use the training set to judge over-fitting (without resampling).

For example, k-nearest neighbors is not known to over-fit, but a 1-NN model will always perfectly predict the training data.
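
As a rough illustration of the 1-NN point (a sketch in Python rather than R, using only the standard library; the data, the seed, and the function names are all made up for the example): the labels below are pure noise, so there is nothing to learn, yet the training-set accuracy is still perfect.

```python
import random


def predict_1nn(train_x, train_y, query):
    """Return the label of the training point closest to `query`."""
    nearest = min(
        range(len(train_x)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], query)),
    )
    return train_y[nearest]


def accuracy(train_x, train_y, eval_x, eval_y):
    """Fraction of evaluation points whose 1-NN prediction matches the label."""
    hits = sum(predict_1nn(train_x, train_y, x) == y for x, y in zip(eval_x, eval_y))
    return hits / len(eval_x)


rng = random.Random(1)  # arbitrary seed, only for repeatability

# Hypothetical data: two random predictors and labels unrelated to them.
x = [(rng.random(), rng.random()) for _ in range(200)]
y = [rng.randint(0, 1) for _ in range(200)]

train_x, train_y = x[:100], y[:100]
test_x, test_y = x[100:], y[100:]

# Each training point is its own nearest neighbor, so this is 1.0
# (assuming no duplicate points with conflicting labels).
train_acc = accuracy(train_x, train_y, train_x, train_y)

# Held-out points are the honest check; with noise labels this is about chance.
test_acc = accuracy(train_x, train_y, test_x, test_y)

print("training accuracy:", train_acc)
print("held-out accuracy:", test_acc)
```

The training number says nothing about over-fitting here; only the held-out (or resampled) number is worth looking at.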

Max

On Oct 23, 2010, at 9:05 AM, "Liaw, Andy" <andy_liaw at merck.com> wrote:

> What Breiman meant is that as the model gets more complex (i.e., as the
> number of trees tends to infinity) the generalization error (test set
> error) does not increase.  This does not hold for boosting, for example;
> i.e., you can't "boost forever", which necessitates finding the
> optimal number of iterations.  You don't need that with RF.
> 
>> -----Original Message-----
>> From: r-help-bounces at r-project.org 
>> [mailto:r-help-bounces at r-project.org] On Behalf Of vioravis
>> Sent: Saturday, October 23, 2010 12:15 AM
>> To: r-help at r-project.org
>> Subject: Re: [R] Random Forest AUC
>> 
>> 
>> Thanks Max and Andy. If the Random Forest is always giving an AUC of 1,
>> isn't it over-fitting? If not, how do you differentiate this from
>> over-fitting? I believe Random Forests are claimed to never over-fit
>> (from the following link).
>> 
>> http://www.stat.berkeley.edu/~breiman/RandomForests/cc_home.htm#features
>> 
>> 
>> Ravishankar R
>> -- 
>> View this message in context: 
>> http://r.789695.n4.nabble.com/Random-Forest-AUC-tp3006649p3008157.html
>> Sent from the R help mailing list archive at Nabble.com.
>> 
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide 
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>> 


