[R] Random Forest AUC

vioravis vioravis at gmail.com
Fri Oct 22 07:19:37 CEST 2010


Guys,

I used random forests on a couple of data sets to predict a binary
response. In every case, the AUC on the training set comes out as 1.
Is this always the case with random forests? Could someone please clarify?

Below is a simple example, first using logistic regression and then a
random forest, to illustrate the problem. The AUC of the random forest
comes out as 1.

data(iris)
iris <- iris[(iris$Species != "setosa"),]
iris$Species <- factor(iris$Species)
fit <- glm(Species~.,iris,family=binomial)
train.predict <- predict(fit,newdata = iris,type="response")          
library(ROCR)
plot(performance(prediction(train.predict,iris$Species),"tpr","fpr"),col = "red")
auc1 <- performance(prediction(train.predict,iris$Species),"auc")@y.values[[1]]
legend("bottomright",
		legend=c(paste("Logistic Regression (AUC=",formatC(auc1,digits=4,format="f"),")",sep="")),
		col=c("red"), lty=1)


library(randomForest)
fit <- randomForest(Species ~ ., data=iris, ntree=50)
train.predict <- predict(fit,iris,type="prob")[,2]          
plot(performance(prediction(train.predict,iris$Species),"tpr","fpr"),col = "red")
auc1 <- performance(prediction(train.predict,iris$Species),"auc")@y.values[[1]]
legend("bottomright",
		legend=c(paste("Random Forests (AUC=",formatC(auc1,digits=4,format="f"),")",sep="")),
		col=c("red"), lty=1)
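
For comparison, here is a minimal sketch that scores the same kind of forest on out-of-bag (OOB) predictions rather than on the rows each tree was grown from. It assumes that predict() on a randomForest object returns the OOB predictions when newdata is omitted, as the package documentation describes. The names iris2 and auc.of and the seed value are made up for this illustration.

```r
# Sketch: resubstitution AUC vs. out-of-bag (OOB) AUC for one forest.
library(randomForest)
library(ROCR)

data(iris)
iris2 <- iris[iris$Species != "setosa", ]
iris2$Species <- factor(iris2$Species)    # drop the unused "setosa" level

set.seed(1)                               # arbitrary seed, only for repeatability
fit <- randomForest(Species ~ ., data = iris2, ntree = 50)

# Hypothetical helper: AUC of a score vector against two-level labels, via ROCR.
auc.of <- function(score, labels) {
  performance(prediction(score, labels), "auc")@y.values[[1]]
}

# Passing the training data as newdata runs every row through all trees,
# including the trees that were grown on that row.
train.prob <- predict(fit, iris2, type = "prob")[, 2]

# Omitting newdata is assumed to give OOB predictions: each row is scored
# only by the trees that did not see it during fitting.
oob.prob <- predict(fit, type = "prob")[, 2]

auc.train <- auc.of(train.prob, iris2$Species)
auc.oob   <- auc.of(oob.prob, iris2$Species)
cat("Training AUC:", formatC(auc.train, digits = 4, format = "f"), "\n")
cat("OOB AUC:     ", formatC(auc.oob, digits = 4, format = "f"), "\n")
```

If the OOB figure comes out below the training figure, that would suggest the AUC of 1 reflects scoring on the same rows the trees were fitted to rather than the forest's performance on unseen data.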

Thank you.

Regards,
Ravishankar R


