[R] RandomForest, Party and Memory Management

Prof Brian Ripley ripley at stats.ox.ac.uk
Mon Feb 4 09:32:03 CET 2013


On Sun, 3 Feb 2013, Lorenzo Isella wrote:

> Dear All,
> For a data mining project, I am relying heavily on the randomForest and party
> packages.
> Because the data set is large, I often run into memory problems (in
> particular with the party package; randomForest seems to use less memory). I
> have two questions at this point.
> 1) Please see below how I am using the party and randomForest packages. Any
> comments are welcome.
>
> myparty <- cforest(SalePrice ~ ModelID + ProductGroup + ProductGroupDesc +
>                      MfgYear + saledate3 + saleday + salemonth,
>                    data = trainRF,
>                    control = cforest_unbiased(mtry = 3, ntree = 300,
>                                               trace = TRUE))
>
> rf_model <- randomForest(SalePrice ~ ModelID + ProductGroup +
>                            ProductGroupDesc + MfgYear + saledate3 +
>                            saleday + salemonth,
>                          data = trainRF, na.action = na.omit,
>                          importance = TRUE, do.trace = 100,
>                          mtry = 3, ntree = 300)
>
> 2) I have another question: sometimes R crashes after telling me that it is 
> unable to allocate e.g. an array of 1.5 Gb.

Do not use the word 'crash': see the posting guide.  I suspect it 
gives you an error message.

> However, I have 4 GB of RAM on my box, so...technically the memory is there,
> but is there a way to enable R to use more of it?

Yes.  I am surmising this is Windows, but you have not told us so.
See the rw-FAQ (the R for Windows FAQ).  The real answer is to run a
64-bit OS: your computer may have 4GB of RAM, but a 32-bit OS gives
each process a 2GB address space, which could be raised to 3GB.
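
As a rough sketch of the address-space point above (not taken from the
rw-FAQ, and assuming R reports sizes in units of 1024^3 bytes), the
following base-R lines check whether the running R is a 32-bit or a
64-bit build and put the failed 1.5 Gb request in proportion to a 2GB
address space.  All variable names are illustrative only.

```r
# Pointer size in bytes: 4 means a 32-bit build of R, 8 a 64-bit build.
pointer_bits <- 8 * .Machine$sizeof.pointer

# Architecture R was built for, e.g. "i386" (32-bit) or "x86_64" (64-bit).
arch <- R.version$arch

# Size of the failed request from the question: 1.5 Gb,
# assuming 1 Gb = 1024^3 bytes.
request_bytes <- 1.5 * 1024^3

# A numeric (double) value occupies 8 bytes, so this is the length of
# the vector R was asked to allocate in one contiguous block.
n_doubles <- request_bytes / 8

# Share of a 2GB per-process address space taken by that single block.
address_space_bytes <- 2 * 1024^3
share_of_2gb <- request_bytes / address_space_bytes

cat("R build:", pointer_bits, "bit (", arch, ")\n")
cat("Doubles requested:", format(n_doubles, big.mark = ","), "\n")
cat("Share of a 2GB address space:", share_of_2gb, "\n")
```

On a 32-bit build, a single request for three quarters of the address
space will usually fail however much RAM is installed, since the block
must be contiguous and R itself and the objects already in the
workspace occupy part of that space.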

>
> Many thanks
>
> Lorenzo
>

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


