[R] Running randomForests on large datasets

Liaw, Andy andy_liaw at merck.com
Wed Feb 27 14:24:47 CET 2008


There are a couple of things you may want to try, if you can load the
data into R and still have enough memory to spare:

- Run randomForest() with fewer trees, say 10 to start with.

- Run randomForest() with nodesize set to something larger than the
default (5 for classification).  This puts a limit on the size of the
trees being grown.  Try something like 21, see if that runs, and
adjust accordingly.  A sketch of both options follows below.
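For concreteness, a minimal sketch combining both suggestions.  `dat`
and the response column `y` are placeholders; substitute your own data.
The x/y interface is used instead of the formula interface, since the
formula machinery makes extra copies of the data, which matters at
500000 x 650:

library(randomForest)

## Start with a small forest and a larger nodesize to see whether one
## fits in memory at all, then increase ntree / decrease nodesize as
## memory allows.
fit <- randomForest(x = dat[, names(dat) != "y"],
                    y = dat$y,
                    ntree = 10,      ## few trees to start with
                    nodesize = 21)   ## larger than default; caps tree size
print(fit)

If 10 trees fit comfortably, you can also grow several small forests
and merge them with combine(), rather than growing one large forest in
a single call.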

HTH,
Andy


From: Nagu

> Hi,
> 
> I am trying to run randomForest on a dataset of size 500000 x 650,
> and R pops up a memory allocation error.  Are there better ways to
> deal with large datasets in R?  For example, S-PLUS had something
> like the bigData library.
> 
> Thank you,
> Nagu
> 




