[R] Format a dataset for use with R with chunking

Jeff Newmiller jdnewmil at dcn.davis.ca.us
Tue Dec 29 00:28:55 CET 2015


Have you looked at the High Performance Computing Task View on CRAN?

Whatever you do, keep in mind that the algorithms you intend to apply will have a strong impact on which data management approach is going to work best. Start small before diving in with all your data, and try successively larger amounts of data to help extrapolate what will happen when you process the whole data set.
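
As a rough illustration of starting small and scaling up, here is a minimal sketch in base R. The fixed-width layout (an 8-character id followed by a 10-character value), the helper names make_sample() and process_in_chunks(), and the chunk size are all assumptions made up for this example, not details of the original poster's data. It reads a file through a connection a chunk at a time, keeps only a running summary in memory, and times the job on successively larger inputs.

```r
# Assumed layout (hypothetical): columns 1-8 hold an id, columns 9-18 a value.
VALUE_START <- 9
VALUE_END <- 18

# Write a small fixed-width sample file so the sketch runs on its own.
make_sample <- function(path, n) {
  lines <- sprintf("%8d%10.3f", seq_len(n), seq_len(n) / 10)
  writeLines(lines, path)
}

# Read the file chunk by chunk, keeping only a running summary in memory.
process_in_chunks <- function(path, chunk_size = 1000) {
  con <- file(path, open = "r")
  on.exit(close(con))
  total <- 0
  rows <- 0
  repeat {
    lines <- readLines(con, n = chunk_size)
    if (length(lines) == 0) break
    value <- as.numeric(substr(lines, VALUE_START, VALUE_END))
    total <- total + sum(value)
    rows <- rows + length(lines)
  }
  list(rows = rows, total = total)
}

# Time successively larger inputs to see how the cost grows with size.
sizes <- c(1000, 10000, 100000)
timings <- sapply(sizes, function(n) {
  path <- tempfile(fileext = ".txt")
  on.exit(unlink(path))
  make_sample(path, n)
  system.time(process_in_chunks(path))[["elapsed"]]
})
print(data.frame(rows = sizes, seconds = timings))
```

If the timings grow roughly linearly with the row count, extrapolating to the full data set is straightforward; if they grow faster, that is a hint the algorithm rather than the storage format needs attention.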

In addition, if you do use SQL, keep in mind that your table schema and index selection can make or break your project (but this is not a SQL support forum).
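
To make the point about indexes concrete, here is a small sketch that assumes the DBI and RSQLite packages are installed and uses an in-memory SQLite database. The table name, the "batch" column, and the index name are hypothetical, chosen only for illustration. It creates an index on the column used to select a slice, asks SQLite for its query plan, and then pulls a single slice into R.

```r
library(DBI)

con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Hypothetical table: 100 batches of 50 rows each.
records <- data.frame(batch = rep(1:100, each = 50),
                      value = seq_len(5000) / 10)
dbWriteTable(con, "records", records)

# Without an index, selecting one batch scans the whole table;
# with one, SQLite can go straight to the matching rows.
dbExecute(con, "CREATE INDEX idx_records_batch ON records (batch)")

# Ask SQLite how it intends to run the query.
plan <- dbGetQuery(con,
  "EXPLAIN QUERY PLAN SELECT value FROM records WHERE batch = 42")
print(plan$detail)

# Pull one batch at a time so only that slice is held in R's memory.
chunk <- dbGetQuery(con,
  "SELECT value FROM records WHERE batch = ?", params = list(42))
print(nrow(chunk))

dbDisconnect(con)
```

Which columns deserve an index depends on how the chunks are selected, so it is worth checking the query plan for the queries you actually run.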
-- 
Sent from my phone. Please excuse my brevity.

On December 28, 2015 1:39:00 PM PST, Mark Finkelstein <finkel.mark at gmail.com> wrote:
>The problem is common: I have 100GB of data, but only 8GB of RAM. I was
>thinking of transforming the 100GB of data, which right now is in a
>non-CSV, fixed row format, to something that R could load quickly and
>easily in chunks - sort of like pages perhaps.
>
>I might be able to do this with some SQL server, but I'm unsure how well
>this works out with the constant conversion, and I feel there might be a
>better approach, since I am particularly interested in speed, as I will
>have to go through several iterations with this data, and speed counts.
>
>I was hoping someone much more experienced than I might have a good
>answer, since there's a lot out there.
>
>Any advice would be very much appreciated.
