[R] problems with large data II

Spencer Graves spencer.graves at pdf.com
Fri Jan 9 15:58:56 CET 2004


      If you can't get more memory, you could read portions of the file 
using "scan(..., skip = ..., nlines = ...)" and then compress the data 
somehow to reduce the size of the object you pass to "randomForest".  
You could run "scan" like this in a loop, processing, e.g., 10% of the 
data file on each pass. 
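
      For example, something like the following rough sketch, assuming 
a whitespace-delimited file "big.txt" with no header line (the file 
name is only a placeholder; the dimensions are from your post):

n.total <- 5000                  # rows, per the original post
n.cols  <- 2000                  # integer variables per row
chunk   <- n.total / 10          # read 10% of the rows per pass
for (start in seq(0, n.total - chunk, by = chunk)) {
    x <- scan("big.txt", what = integer(0),
              skip = start, nlines = chunk, quiet = TRUE)
    x <- matrix(x, nrow = chunk, ncol = n.cols, byrow = TRUE)
    ## ... compress or summarize x here before reading the next piece ...
}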

      Alternatively, you could pass each portion to "randomForest" and 
compare the results from the several calls.  This would produce a type 
of cross-validation, which might be a wise thing to do anyway. 
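
      The "randomForest" package also provides a "combine" function 
that merges forests grown separately, so the per-portion fits need not 
be thrown away.  A small sketch of the idea, using the built-in iris 
data as a stand-in for your own portions (the three-way split and the 
ntree value are arbitrary):

library(randomForest)
set.seed(1)
## three arbitrary portions of the built-in iris data
portions <- split(iris, rep(1:3, length.out = nrow(iris)))
fits <- lapply(portions, function(d)
    randomForest(Species ~ ., data = d, ntree = 100))
## compare the out-of-bag error rates of the separate fits ...
sapply(fits, function(f) f$err.rate[f$ntree, "OOB"])
## ... or merge them into a single larger forest
rf.all <- do.call(combine, fits)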

      hope this helps. 
      spencer graves

PaTa PaTaS wrote:

>Thank you all for your help. The problem is not only with reading the data (5000 cases by 2000 integer variables, imported either from an SPSS or a TXT file) into my R 1.8.0, but also with the procedure I would like to use, "randomForest" from the library "randomForest". It is not possible to run it on such a data set (because of an insufficient-memory error). Moreover, my data has factors with more than 32 levels, which causes another error.
>
>Could you suggest a solution to my problem? Thank you very much. 