[R] To get more digits in precision of predict function of randomForests

Uwe Ligges ligges at statistik.tu-dortmund.de
Mon Feb 25 18:31:32 CET 2008



Nagu wrote:
> Thank you Uwe Ligges.
> 
> Yes, I had only 50 trees. I run into memory problems when I use a
> large number of trees. Also, I am going to post my next question in a
> separate thread, but it does no harm to ask here: how do I deal
> with large datasets when using randomForest? I have datasets of
> approximately 500000 x 650, and R just can't deal with them (it
> reports memory allocation problems).

If you want to use all variables at the same time (otherwise use
database access), you will run into trouble with less than 4 GB of RAM
or so, but it might work well on a 32 GB machine, I guess.
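
As a rough illustration of the memory figures above, here is a minimal
base R sketch. It assumes all 650 columns are stored as 8-byte doubles;
the variable names are made up for this example. Model fitting usually
needs several times the size of one copy, so treat the result as a
lower bound, not a measurement.

```r
# Back-of-the-envelope size of a 500000 x 650 numeric matrix.
# Assumption: every cell is an 8-byte double (no factors or integers).
n_rows <- 500000
n_cols <- 650
bytes  <- n_rows * n_cols * 8
gib    <- bytes / 1024^3
cat(sprintf("One in-memory copy: about %.2f GiB\n", gib))

# Sanity check on a small matrix with the same number of columns:
# object.size() should report roughly 1000 * 650 * 8 bytes plus a header.
small <- matrix(0, nrow = 1000, ncol = n_cols)
small_bytes <- as.numeric(object.size(small))
cat(sprintf("1000-row sample: %.0f bytes\n", small_bytes))
```

A single copy already takes a large share of 4 GB, which fits the
estimate that such a machine will struggle.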

> Are there any better ways to deal with
> large datasets in R? For example, Splus had something like the bigData
> library.

The bigData library works only for some methods such as lm/glm, but not
for random forests.

Uwe Ligges


> 
> Thank you,
> Nagu
> 
> On Mon, Feb 25, 2008 at 1:56 AM, Uwe Ligges
> <ligges at statistik.tu-dortmund.de> wrote:
>>
>>
>>  Nagu wrote:
>>  > Hi,
>>  >
>>  > I am using randomForest for a classification problem. The predict
>>  > function in the randomForest package, when asked to return the
>>  > probabilities, has a precision of two digits after the decimal point.
>>  > I need at least four digits of precision for the predicted
>>  > probabilities. How do I achieve this?
>>
>>  For me it gives the desired precision, adapting the
>>  ?predict.randomForest example:
>>
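>>  library(randomForest)
>>  data(iris)
>>  set.seed(111)
>>  # split into roughly 80% training (ind == 1) and 20% test (ind == 2)
>>  ind <- sample(2, nrow(iris), replace = TRUE, prob = c(0.8, 0.2))
>>  iris.rf <- randomForest(Species ~ ., data = iris[ind == 1, ], ntree = 2000)
>>  # class probabilities for the held-out rows
>>  iris.pred <- predict(iris.rf, iris[ind == 2, ], type = "prob")
>>  iris.pred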
>>
>>  Maybe you do not have many trees in your forest? The predicted
>>  probabilities are vote fractions, so they change only in steps of
>>  1/ntree.
>>
>>  Uwe Ligges
>>
>>  >
>>  > Thank you,
>>  > Nagu
>>  >
>>
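
The link between tree count and printed precision in the quoted reply
can be sketched in base R. This assumes that type = "prob" reports, for
each class, the fraction of trees voting for that class. The votes
below are simulated with rmultinom() from made-up class probabilities,
not produced by randomForest itself.

```r
# Assumption: each predicted probability is (votes for class) / ntree,
# so the smallest possible step between two values is 1 / ntree.
ntree <- c(50, 100, 2000, 10000)
steps <- data.frame(ntree = ntree, smallest_step = 1 / ntree)
print(steps)

# Simulated votes of a 50-tree forest for one observation.
# The probabilities c(0.7, 0.2, 0.1) are hypothetical.
set.seed(1)
votes <- rmultinom(1, size = 50, prob = c(0.7, 0.2, 0.1))[, 1]
frac  <- votes / 50
print(frac)  # every entry is a multiple of 0.02
```

Under this assumption, 50 trees cannot give more than two decimal
places, and four reliable digits would need a forest on the order of
10000 trees.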