[R] randomforests - how to classify

pdb philb at philbrierley.com
Tue May 4 21:07:43 CEST 2010


I'm experimenting with random forests and want to perform a binary
classification task.
I've tried some of the sample code in the help files and it runs, but I
get a message to the effect of 'you don't have very many unique values in the
target - are you sure you want to do regression?' (sorry, I don't know the
exact message, and R is busy now so I can't check).

In reading the help files I see 2 examples, one for classification and one
for regression. To the uninformed, these don't seem much different from each
other. How does rf know whether to do regression or classification?

## Classification:
iris.rf <- randomForest(Species ~ ., data=iris, importance=TRUE,
                        proximity=TRUE)

## Regression:
## data(airquality)
ozone.rf <- randomForest(Ozone ~ ., data=airquality, mtry=3,
                         importance=TRUE, na.action=na.omit)

My target variable only has 2 values - why does it want to do regression?
I've entered code just like that in the classification example above. Also
when it asks me 'are you sure you want to do regression' - how do I say 'NO,
do classification please'?
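
For reference, a minimal sketch of what appears to decide the mode: randomForest
fits a classification forest when the response is a factor and a regression
forest when it is numeric, so a two-valued target stored as 0/1 numbers gets
regression plus the warning about few unique values. The data frame d and its
columns x1, x2 and y below are hypothetical toy data, not the poster's data.

```r
library(randomForest)

set.seed(1)
# Hypothetical toy data: a two-valued target stored as numeric 0/1
d <- data.frame(x1 = rnorm(100), x2 = rnorm(100))
d$y <- as.numeric(d$x1 + d$x2 > 0)

# Numeric response: a regression forest is fitted, with a warning that the
# response has few unique values (suppressed here to keep the output clean)
rf.reg <- suppressWarnings(randomForest(y ~ ., data = d))
rf.reg$type   # "regression"

# Converting the response to a factor requests classification instead
d$y <- as.factor(d$y)
rf.cls <- randomForest(y ~ ., data = d)
rf.cls$type   # "classification"
```

Under this assumption, the iris example classifies because Species is already a
factor, while Ozone in airquality is numeric and therefore gets regression.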

