[R] Another big data size problem
ligges at statistik.uni-dortmund.de
Wed Jul 28 09:53:08 CEST 2004
Federico Gherardini wrote:
> Hi all,
> I'm trying to read a 1220 * 20000 table in R, but I'm having a lot
> of problems. Basically, what happens is that R.bin starts eating all
> my memory until it gets to about 90%. At that point it locks itself
> in an uninterruptible sleep state (at least that's what top says),
> where it just sits there, barely using the CPU at all but keeping
> its tons of memory. I've tried read.table and scan, but neither did
> the trick. I've also tried a horrible hack: reading one line at a
> time and gradually combining everything into a matrix using
> rbind... nope! It seems I can read up to 500 lines in *decent* time,
> but nothing more. The machine is a 3 GHz P4 with HT and 512 MB RAM
> running R-1.8.1. Will I have to write a little C program myself to
> handle this, or am I missing something?
If your data is numeric, you will need roughly

  1220 * 20000 * 8 / 1024 / 1024  ~  186 MB (call it 200 MB)

just to store one copy in memory, since each double takes 8 bytes. If
you need more than two copies at once (and functions like read.table
make temporary copies while parsing), your machine with its 512 MB
will start to use swap space.
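
For a quick sanity check, you can do the arithmetic in R itself (a
small sketch, assuming the data are all doubles at 8 bytes each; best
not run on the 512 MB box while R is already swapping):

  1220 * 20000 * 8 / 1024^2                  # 186.16, the figure above

  x <- matrix(0, nrow = 1220, ncol = 20000)  # an all-numeric matrix
  as.numeric(object.size(x)) / 1024^2        # about the same, plus a
                                             # small object header
  rm(x)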
Hence either use a machine with more memory, or don't keep all the
data in memory at once, e.g. by processing it in chunks or by making
use of a database.
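
If you go the chunked route, something along these lines should work
(a minimal sketch, untested against your data: the file name
"data.txt", whitespace-separated numbers, no header row, and 20000
values per line are all assumptions):

  con <- file("data.txt", open = "r")
  repeat {
      ## pull in 100 lines' worth of numbers at a time
      chunk <- scan(con, what = double(), nlines = 100, quiet = TRUE)
      if (length(chunk) == 0) break
      m <- matrix(chunk, ncol = 20000, byrow = TRUE)
      ## ... process m here (accumulate sums, write to a database, ...)
  }
  close(con)

As an aside, the rbind() hack is slow because every call copies all
the rows read so far, so the cost grows quadratically with the number
of lines; if you do read line by line, preallocate the full matrix
once and fill it by row index instead.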
> Thanks in advance for your help,