[R] Can't seem to finish a randomForest.... Just goes and goes!

Liaw, Andy andy_liaw at merck.com
Mon Apr 5 02:07:06 CEST 2004


When you have fairly large data, _do not use the formula interface_, as a
couple of copies of the data would be made.  Try simply:

Myforest.rf <- randomForest(Mydata[, -46], Mydata[, 46],
                            ntree=100, mtry=7)

[Note that you don't need to set proximity (not proximities) or importance
to FALSE, as that's the default already.]

You might also want to use do.trace=1 to see whether trees are actually
being grown.  (This assumes output is not being buffered; if it is, as in
Rgui on Windows, you'll probably want to turn the buffering off.)
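
For anyone who wants to try this without the original file, here is a minimal
sketch on made-up data of the same shape (1,855 cases, 45 numeric predictors,
a two-level factor in column 46).  The simulated values are purely an
assumption for illustration and say nothing about how long the real data
should take.  Note that the package's argument for the number of trees is
spelled `ntree`, and that do.trace=10 should print a progress line every 10
trees rather than every tree.

```r
library(randomForest)

# Made-up stand-in for the poster's data (illustration only):
# 45 numeric predictors V1..V45 plus a 0/1 response stored as a factor.
set.seed(1)
n <- 1855
p <- 45
Mydata <- as.data.frame(matrix(rnorm(n * p), nrow = n))
Mydata$V46 <- factor(rbinom(n, 1, 0.5))

# x/y interface (no formula), with progress reported every 10 trees.
Myforest.rf <- randomForest(Mydata[, -46], Mydata[, 46],
                            ntree = 100, mtry = 7, do.trace = 10)

# A factor response should give a classification forest of 100 trees.
print(Myforest.rf$type)
print(Myforest.rf$ntree)
```

If the trace lines appear steadily, the trees are being grown and the run is
just slow; if nothing appears at all, the time is going somewhere before the
first tree is built.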

I have run randomForest on data sets much larger than that without problems,
so I don't imagine your data would be `difficult'.  (I have not used the
Mac, though.)

Andy

> From: David L. Van Brunt, Ph.D.
> 
> Playing with randomForest, samples run fine. But on real data, no go.
> 
> Here's the setup: OS X, same behavior whether I'm using R-Aqua 1.8.1 or
> the Fink compile-of-my-own with X-11, R version 1.8.1.
> 
> This is on OS X 10.3 (aka "Panther"), G4 800 MHz with 512 MB physical RAM.
> 
> I have not altered the Startup options of R.
> 
> Data set is read in from a text file with "read.table", and has 46
> variables and 1,855 cases. Trying the following:
> 
> The DV is categorical, 0 or 1. Most of the IVs are either continuous, or
> correctly read in as factors. The largest factor has 30 levels.... Only
> the DV seems to need identifying as a factor to force classification
> trees over regression:
> 
> >Mydata$V46<-as.factor(Mydata$V46)
> >Myforest.rf<-randomForest(V46~.,data=Mydata,ntrees=100,mtry=7,
>     proximities=FALSE, importance=FALSE)
> 
> 5 hours later, R.bin was still taking up 75% of my processor. When I've
> tried this with larger data, I get errors referring to the buffer (sorry,
> not in front of me right now).
> 
> Any ideas on this? The data don't seem horrifically large. Seems like
> there are a few options for setting memory size, but I'm not sure which
> of them to try tweaking, or if that's even the issue.