[R] How to use classwt parameter option in RandomForest

Nagu thogiti at gmail.com
Wed May 21 05:30:00 CEST 2008


Hi,

I am trying to model a dataset with random forests in R. The response
variable Y has 6 levels {Great, Greater, Greatest, Weak, Weaker,
Weakest}, and the predictors X are a mix of continuous and factor
variables. Y behaves like an ordinal variable, but I recoded it as a
plain (unordered) factor.

I ran a simulation and got an OOB estimate of the error rate of 60%. I
validated against some external datasets and got about 59%
misclassification error. I would like to tinker with the classwt
option of the randomForest function to see if I can get better
performance from the model. My confusion is about how to define these
weights. If I say classwt = c(3,6,9,1,2,3), how exactly do the levels
get weighted? If this were a 6x6 matrix, I could put a number in each
cell to adjust the weights. How does the classwt option work?
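
To make the question concrete, here is a minimal sketch of the call I
have in mind, run on simulated data. The data frame dat, the predictors
x1 and x2, and ntree = 200 are made up for illustration. The help page
describes classwt as the priors of the classes, so the sketch passes a
vector with one positive weight per class rather than a matrix. That
the weights line up with levels(Y) is my assumption, which is why the
sketch names the weights and reorders them explicitly.

```r
library(randomForest)

set.seed(1)

# Hypothetical stand-in for the real data: 6-level response, mixed predictors
lev <- c("Great", "Greater", "Greatest", "Weak", "Weaker", "Weakest")
n <- 300
dat <- data.frame(
  Y  = factor(sample(lev, n, replace = TRUE), levels = lev),
  x1 = rnorm(n),
  x2 = factor(sample(c("a", "b", "c"), n, replace = TRUE))
)

# One weight per class; naming them documents which level gets which weight
wts <- c(Great = 3, Greater = 6, Greatest = 9,
         Weak = 1, Weaker = 2, Weakest = 3)

# Assumption: classwt is read in the order of levels(Y), so reorder to match
wts <- wts[levels(dat$Y)]

fit <- randomForest(Y ~ ., data = dat, classwt = wts, ntree = 200)

# OOB confusion matrix: one row per class plus a class.error column
print(fit$confusion)
```

Checking levels(dat$Y) before building the weight vector seems
worthwhile, since factor() sorts levels alphabetically unless told
otherwise.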

Thank you in advance for any ideas.

Nagu


