[R] RandomForest question
ligges at statistik.uni-dortmund.de
Thu Jul 21 16:31:36 CEST 2005
Arne.Muller at sanofi-aventis.com wrote:
> I'm trying to find the optimal number of variables tried at each
> split (the mtry parameter) for a randomForest classification. The
> classification is binary and there are 32 explanatory variables
> (mostly factors with up to 4 levels each, but also some numeric
> variables) and 575 cases.
> I've seen that although there are only 32 explanatory variables, the
> best classification performance is reached when choosing mtry=80. How
> is it possible that more variables can be used than there are columns
> in the data frame?
If some of the variables are factors, dummy variables are generated for
their levels, so the later process works with more variables than the
data frame has columns.
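
To make the arithmetic concrete, here is a minimal sketch in plain Python. It
only counts columns after dummy coding and does not call randomForest or
reproduce its internals. The function name expanded_width and the split of the
32 variables into 28 four-level factors and 4 numeric columns are assumptions
made up for illustration, since the post does not give the exact breakdown.

```python
def expanded_width(levels_per_column, drop_first=True):
    """Count columns after dummy coding (hypothetical helper).

    levels_per_column has one entry per data-frame column: an int k marks a
    factor with k levels, None marks a numeric column.
    """
    width = 0
    for k in levels_per_column:
        if k is None:
            width += 1        # a numeric column stays a single column
        elif drop_first:
            width += k - 1    # treatment coding: one level is the baseline
        else:
            width += k        # full indicator coding: one column per level
    return width


# Assumed breakdown, not taken from the post: 28 four-level factors, 4 numeric.
columns = [4] * 28 + [None] * 4

print(len(columns))                              # columns in the data frame
print(expanded_width(columns))                   # with a baseline level dropped
print(expanded_width(columns, drop_first=False)) # one indicator per level
```

Under that assumed breakdown the 32 columns expand to 88 or 116 variables,
depending on the coding, so a value such as mtry=80 would no longer exceed the
number of available variables.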
> thanks for your help + kind regards,