[R] randomForest can not handle categorical predictors with more than 32 categories

Erik Iverson eriki at ccbr.umn.edu
Wed Nov 10 23:06:40 CET 2010


Well, the error message seems relatively straightforward.

When you run str(x) (you did not provide the data), you should see that
one or more columns are factors with more than 32 levels.  As the error
message says, randomForest cannot handle those predictors, so you will
need to drop them or reduce their number of levels before the call.

You might find the following line of code useful:

which(sapply(x, function(y) nlevels(y) > 32))
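
If dropping those columns is not an option, one possibility is to collapse
the rare levels into a single catch-all level. The following is only a
rough sketch under stated assumptions: the data frame is made-up toy data
standing in for your CSV, lump_levels() is a hypothetical helper (not part
of randomForest), and the limit of 32 is taken from the error message you
quoted.

```r
# Toy data standing in for the real CSV (column names are made up).
set.seed(131)
x <- data.frame(
  sales  = rnorm(200),
  store  = factor(sample(paste0("s", 1:50), 200, replace = TRUE)),
  region = factor(sample(c("N", "S", "E", "W"), 200, replace = TRUE))
)

# Names of the columns that are factors with more than 32 levels.
too_many <- names(which(sapply(x, function(y) nlevels(y) > 32)))

# Hypothetical helper: keep the `keep` most frequent levels and
# lump every other non-missing value into a single `other` level.
lump_levels <- function(f, keep = 31, other = "Other") {
  counts <- sort(table(f), decreasing = TRUE)
  top <- names(counts)[seq_len(min(keep, nlevels(f)))]
  f <- as.character(f)
  f[!is.na(f) & !(f %in% top)] <- other
  factor(f)
}

# Apply the helper only to the offending columns.
x[too_many] <- lapply(x[too_many], lump_levels)

# Number of levels per column after lumping (0 for non-factors).
sapply(x, nlevels)
```

Whether lumping is sensible depends on the data; for something like a
store or title ID, it may be more reasonable to leave the column out of
the model altogether.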

Mai Dang wrote:
> I received this error
> Error in randomForest.default(m, y, ...) :
> Can not handle categorical predictors with more than 32 categories.
> 
> using below code
> 
> library(randomForest)
> library(MASS)
> memory.limit(size=12999)
> x <- read.csv("D:/train_store_title_view.csv", header=TRUE)
> x <- na.omit(x)
> set.seed(131)
> sales.rf <- randomForest(sales ~ ., data=x, mtry=3,
> importance=TRUE)
> 
> My machine (i7) running on 64 bit R with 12 gigs of RAM.
> 
> Would anyone know how to avoid this error ?
> Thank You for your reply,
> 
> Mai Dang


