[R] randomForest can not handle categorical predictors with more than 32 categories

Sven Garbade sfgarbade at googlemail.com
Thu Nov 11 21:46:05 CET 2010


You can try ctree in the party package, but in any case: what is the
point of a binary split on a variable with more than 32 levels?
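
To put a rough number on that: an unordered factor with k levels can be
divided into two non-empty groups in 2^(k-1) - 1 ways, so the number of
candidate splits roughly doubles with every extra level. A minimal sketch
in base R (the helper name n_binary_splits is made up for illustration; it
only counts partitions and makes no claim about how randomForest searches
them internally):

```r
# Number of ways to divide k unordered levels into two non-empty groups.
# n_binary_splits is a hypothetical helper, not part of any package.
n_binary_splits <- function(k) 2^(k - 1) - 1

# Compare a few factor sizes, including the 32/33 boundary from the error.
n_binary_splits(c(3, 10, 32, 33))
```

With 33 levels there are already more than four billion possible
partitions, which is why a split on such a variable is hard to interpret.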

Regards, Sven

2010/11/10 Erik Iverson <eriki at ccbr.umn.edu>:
> Well, the error message seems relatively straightforward.
>
> When you run str(x) (you did not provide the data), you should see that
> one or more components are factors with more than 32 levels. Apparently
> you can't include those predictors in a call to randomForest.
>
> You might find the following line of code useful:
>
> which(sapply(x, function(y) nlevels(y) > 32))
>
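
Once the check quoted above has flagged the offending columns, one possible
workaround (an assumption on my part, not something the package
prescribes) is to keep the most frequent levels and merge the rest into a
single catch-all level. A sketch in base R; lump_levels, the "Other" label
and the toy data frame are all hypothetical, since the real data was not
posted:

```r
# Hypothetical helper: keep the (max_levels - 1) most frequent levels of a
# factor and merge all remaining levels into one catch-all level.
lump_levels <- function(f, max_levels = 32, other = "Other") {
  f <- droplevels(as.factor(f))
  if (nlevels(f) <= max_levels) return(f)
  counts <- sort(table(f), decreasing = TRUE)
  keep <- names(counts)[seq_len(max_levels - 1)]
  lv <- levels(f)
  lv[!(lv %in% keep)] <- other   # duplicated level names are merged
  levels(f) <- lv
  f
}

# Toy data standing in for the real file: one factor with 40 levels.
toy <- data.frame(
  store = factor(rep(sprintf("s%02d", 1:40), length.out = 500)),
  sales = seq_len(500)
)

# Same check as in the quoted reply, then lump only the flagged columns.
too_many <- which(sapply(toy, function(y) nlevels(y) > 32))
toy[too_many] <- lapply(toy[too_many], lump_levels)
sapply(toy, nlevels)
```

Lumping throws away the distinction between the rare levels, so whether it
is acceptable depends on what those levels mean in the data.
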
> Mai Dang wrote:
>>
>> I received this error:
>> Error in randomForest.default(m, y, ...) :
>> Can not handle categorical predictors with more than 32 categories.
>>
>> when running the code below:
>>
>> library(randomForest)
>> library(MASS)
>> memory.limit(size=12999)
>> x <- read.csv("D:/train_store_title_view.csv", header=TRUE)
>> x <- na.omit(x)
>> set.seed(131)
>> sales.rf <- randomForest(sales ~ ., data=x, mtry=3,
>> importance=TRUE)
>>
>> My machine (i7) is running 64-bit R with 12 GB of RAM.
>>
>> Would anyone know how to avoid this error?
>> Thank you for your reply,
>>
>> Mai Dang
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.