[R] Decision trees take long or throws memory limit message
minorthreatx at hotmail.com
Fri Jan 30 14:07:42 CET 2015
I am trying to build decision trees and random forests using the packages rpart and party, and I'm already stuck at the first step. Each time I run the code, one of two things happens:
1. R runs for more than an hour. I haven't waited long enough to see whether it produces a result, but it doesn't look like it will. When I hit the "stop" button, R freezes and I have to force quit it. This happens with rpart().
2. I get the message:
Error: cannot allocate vector of size 5.0 Gb
In addition: Warning messages:
1: In cbind(RET, tr[[i]]) :
  Reached total allocation of 16287Mb: see help(memory.size)
2: In cbind(RET, tr[[i]]) :
  Reached total allocation of 16287Mb: see help(memory.size)
3: In cbind(RET, tr[[i]]) :
  Reached total allocation of 16287Mb: see help(memory.size)
4: In cbind(RET, tr[[i]]) :
  Reached total allocation of 16287Mb: see help(memory.size)
When I look at Windows Task Manager, memory use goes from "in use: 4 GB" to "in use: 14.5 GB", leaving no memory free (15.9 GB is the limit on my computer). The training set is big (almost 9 million records and 13 variables). I already increased memory.limit(), but it didn't help. This happens with ctree() and cforest().
I don't have much technical knowledge and I am a beginner at R. I use the 64-bit version on Windows.
Examples of the code that I used:
dt <- rpart(Product ~ Age + TotalChildren + NumberCarsOwned, data=TrainData, method="class", control=rpart.control(minsplit=50, cp=0, xval=0))
Formula <- Product ~ Age + TotalChildren + NumberCarsOwned
ctree <- ctree(Formula, data=TrainData)
rf <- cforest(rFormula, data=TrainData)
On a smaller data set (of 18,000 records) it does seem to work. How can I make it work on my full data set? Is there something in the arguments that I should change?
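For anyone who wants to reproduce the smaller case, here is a minimal sketch of fitting the same rpart() call on a random subsample of 18,000 rows. The real TrainData is not shown in this post, so the data frame below is a synthetic stand-in: its row count, value ranges, and the way Product is generated are all assumptions made purely for illustration.

```r
# Sketch only: synthetic stand-in for TrainData (the real data is not shown).
library(rpart)

set.seed(1)
n <- 100000  # hypothetical size, far smaller than the real ~9 million rows
TrainData <- data.frame(
  Age             = sample(18:90, n, replace = TRUE),
  TotalChildren   = sample(0:5,   n, replace = TRUE),
  NumberCarsOwned = sample(0:4,   n, replace = TRUE)
)
# Hypothetical outcome, loosely tied to Age so the tree has something to split on
TrainData$Product <- factor(
  ifelse(TrainData$Age + rnorm(n, sd = 10) > 50, "A", "B")
)

# Draw a random subsample of 18,000 rows, the size reported to work above
idx <- sample(nrow(TrainData), 18000)
SmallData <- TrainData[idx, ]

# Same call as in the post, applied to the subsample
dt_small <- rpart(Product ~ Age + TotalChildren + NumberCarsOwned,
                  data = SmallData, method = "class",
                  control = rpart.control(minsplit = 50, cp = 0, xval = 0))

# Approximate in-memory size of the full (synthetic) data frame
print(object.size(TrainData), units = "Mb")
```

Running object.size() on the real TrainData in the same way would show how much of the 16 GB the data alone occupies before any model is fitted.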