[R] Large data set

jim holtman jholtman at gmail.com
Mon Jul 23 16:49:33 CEST 2012


First of all, try to determine the largest file you can read into an
empty workspace.  Once you have done that, break your file into chunks
of that size and read them in.  The next question is what you want to
do with 112M rows of data: can you process them one chunk at a time and
then aggregate the results?  A rough, untested sketch of that approach
is below.  I have no problem reading files with 10M rows on a 32-bit
version of R on Windows with 3GB of memory.
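
Something along these lines might serve as a starting point.  It is an
untested sketch: the file name, chunk size, and the 2 integer / 6
logical colClasses are assumptions based on your description of the
data, so adjust them to what you actually have.  Specifying colClasses
also keeps the memory footprint of each chunk down.

con <- file("bigfile.txt", open = "r")
chunk.size <- 1e6                          # rows per chunk; tune to your memory
col.classes <- c(rep("integer", 2), rep("logical", 6))
summaries <- list()
repeat {
    chunk <- tryCatch(
        read.table(con, nrows = chunk.size, header = FALSE,
                   colClasses = col.classes),
        error = function(e) NULL)          # NULL once the file is exhausted
    if (is.null(chunk) || nrow(chunk) == 0) break
    ## do the per-chunk work here and keep only the (small) result
    summaries[[length(summaries) + 1]] <- colSums(chunk[1:2])
    if (nrow(chunk) < chunk.size) break    # a short chunk means end of file
}
close(con)
## then combine the per-chunk results, e.g. Reduce("+", summaries)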

So a little more information on "what is the problem you are trying to
solve" would be useful.

On Mon, Jul 23, 2012 at 8:02 AM, Lorcan Treanor
<lorcan.treanor at idiro.com> wrote:
> Hi all,
>
> Have a problem. I am trying to read in a data set that has about 112,000,000
> rows and 8 columns, and it is obviously too big for R to handle. The
> columns are made up of 2 integer columns and 6 logical columns. The text
> file is about 4.2 Gb in size. I have 4 Gb of RAM and 218 Gb of available
> space on the hard drive. I tried the dumpDF function but the data were too
> big. I also tried bringing the data in as 10 sets of about 12,000,000 rows
> each. Are there other ways of getting around the size of the data?
>
> Regards,
>
> Lorcan
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.
