[R] Format a dataset for use with R with chunking

Mark Finkelstein finkel.mark at gmail.com
Mon Dec 28 22:39:00 CET 2015


The problem is a common one: I have 100 GB of data but only 8 GB of RAM.
I was thinking of transforming the 100 GB of data, which right now is in a
non-CSV, fixed row format, into something that R could load quickly and
easily in chunks, sort of like pages perhaps.
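
To make the idea concrete, here is a rough sketch in base R of the kind of
conversion I have in mind. It assumes each row consists of fixed-width
fields; the widths, column names, file names and the convert_in_chunks
helper are all made up for illustration, and it is only a sketch, not
something tuned or tested at the 100 GB scale.

```r
# Convert a fixed-width text file into numbered .rds chunk files.
# All names, widths and paths here are hypothetical placeholders.
convert_in_chunks <- function(path, widths, col_names, out_dir,
                              chunk_rows = 100000L) {
  starts <- cumsum(c(1L, head(widths, -1L)))  # first character of each field
  ends   <- cumsum(widths)                    # last character of each field
  con <- file(path, open = "r")
  on.exit(close(con))
  out_files <- character(0)
  repeat {
    lines <- readLines(con, n = chunk_rows)   # read only the next block of rows
    if (length(lines) == 0L) break            # end of file reached
    fields <- lapply(seq_along(widths), function(j) {
      trimws(substring(lines, starts[j], ends[j]))
    })
    names(fields) <- col_names
    # Columns are kept as character here; convert types as needed.
    chunk <- as.data.frame(fields, stringsAsFactors = FALSE)
    out <- file.path(out_dir,
                     sprintf("chunk_%05d.rds", length(out_files) + 1L))
    saveRDS(chunk, out)
    out_files <- c(out_files, out)
  }
  out_files
}

# Tiny demonstration on a made-up 5-row file with fields of width 3 and 4.
demo_dir <- tempfile("chunks_")
dir.create(demo_dir)
demo_in <- file.path(demo_dir, "demo.txt")
writeLines(sprintf("%3d%4d", 1:5, (1:5) * 10L), demo_in)
chunk_files <- convert_in_chunks(demo_in, c(3L, 4L), c("id", "value"),
                                 demo_dir, chunk_rows = 2L)

# Later passes load one chunk at a time, so memory use stays bounded.
for (f in chunk_files) {
  chunk <- readRDS(f)
  # ... per-chunk work goes here ...
}
```

The conversion cost would be paid once, and each later iteration over the
data would only need readRDS() on one chunk at a time.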

I might be able to do this with an SQL server, but I'm unsure how well
that would work out given the constant conversion involved. I suspect there
is a better approach: I will have to go through several iterations with
this data, so speed counts.

I was hoping someone much more experienced than I am might have a good
answer, since there are a lot of options out there.

Any advice would be very much appreciated.



