[R] Question about random Forest function in R

Liaw, Andy andy_liaw at merck.com
Tue May 29 21:18:30 CEST 2012


Hi Kelly,

The function has a limitation: it cannot handle any column in your "x" that is a categorical variable (a factor) with more than 32 categories.  One possibility is to see if you can "bin" some of the categories together so that each such variable ends up with 32 categories or fewer.
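
As a rough illustration of the binning idea, here is a minimal sketch in base R. It finds the factor columns with more than 32 levels and merges the least frequent levels into one catch-all level. The helper name lump_rare_levels, the "Other" label, and the toy data frame are assumptions made up for this example; they are not part of the randomForest package, and whether merging rare categories makes sense depends on your data.

```r
# Hypothetical helper (not from randomForest): keep the most frequent levels
# of a factor and merge all remaining levels into a single catch-all level.
lump_rare_levels <- function(f, max_levels = 32, other = "Other") {
  f <- droplevels(as.factor(f))
  if (nlevels(f) <= max_levels) return(f)
  counts <- sort(table(f), decreasing = TRUE)
  keep <- names(counts)[seq_len(max_levels - 1)]  # leave one slot for `other`
  lv <- levels(f)
  lv[!(lv %in% keep)] <- other                    # duplicated labels are merged
  levels(f) <- lv
  f
}

# Toy stand-in for the predictor data frame: one factor with 40 levels
# (20 common, 20 rare) and one numeric column.
toy <- data.frame(
  site  = factor(rep(paste0("s", 1:40), times = c(rep(10, 20), rep(1, 20)))),
  depth = seq_len(220)
)

# Names of the factor columns that exceed the limit.
too_many <- names(toy)[sapply(toy, function(col) is.factor(col) && nlevels(col) > 32)]

# Bin only those columns; everything else is left untouched.
toy[too_many] <- lapply(toy[too_many], lump_rare_levels)
```

The same two last steps could be applied to the real "x" data frame in place of toy before calling randomForest.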

Andy 

-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Kelly Cool
Sent: Tuesday, May 29, 2012 10:47 AM
To: r-help at r-project.org
Subject: [R] Question about random Forest function in R



Hello, 

I am trying to run the randomForest function on a data.frame using the following code:

library(randomForest)
myrf <- randomForest(y=sample_data_metal, x=Train, importance=TRUE, proximity=TRUE)


However, an error occurs saying, "can not handle categorical predictors with more than 32 categories". 

My "x=Train" data.frame is quite large and my "y=sample_data_metal" is one column. 

I'm not sure how to go about fixing this error or if there is even a way to get around this error. Thanks in advance for any help. 
