[R] Advice on large data structures

Jim Holtman jholtman at gmail.com
Fri Sep 2 12:33:13 CEST 2011


I would suggest that if you want to use R, you get a 64-bit build on a machine with at least 24GB of memory to start. If your data is a numeric matrix, you will need about 8GB for a single copy, and R often makes temporary copies of an object during a computation.
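
The arithmetic, as a sanity check (doubles are 8 bytes each):

    rows <- 5e6
    cols <- 200
    rows * cols * 8 / 2^30    # ~7.45 GiB for one copy of the matrix

That is why I suggest 24GB rather than 8GB: you want headroom for a couple of copies plus the rest of your workspace.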

Do you really need it all in memory at once, or can you partition the problem?  Can you use a database to access just the portion you need at any given time?
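
For instance, to partition it, a file-backed matrix lets you work on one column at a time. Here is a rough sketch using the bigmemory package you mention (the file names are made up):

    library(bigmemory)

    # The matrix lives on disk, so the full ~8GB never has to be
    # resident in RAM at once.
    x <- filebacked.big.matrix(nrow = 5e6, ncol = 200, type = "double",
                               backingfile = "big.bin",
                               descriptorfile = "big.desc")

    # x[, j] pulls one ~40MB column into memory at a time.
    col.means <- sapply(1:200, function(j) mean(x[, j]))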

If you only need one or two columns at a time, then storing the columns in a database and querying for just what you need might work.  You probably need to do some more analysis of exactly how you want to solve your problem, given the memory limitations of the system.
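
Something along these lines, say with the RSQLite package (the table and column names here are placeholders, and I am assuming the data has already been loaded into the database):

    library(DBI)
    library(RSQLite)
    library(zoo)

    con <- dbConnect(SQLite(), "bigdata.sqlite")

    # Fetch only the two columns a given calculation needs; each
    # 5-million-row numeric column is only about 40MB in memory.
    dat <- as.matrix(dbGetQuery(con, "SELECT x, y FROM series"))

    # A rolling function of windows of the two columns together,
    # along the lines of the rollapply operations you describe.
    res <- rollapply(dat, width = 20,
                     FUN = function(w) cor(w[, 1], w[, 2]),
                     by.column = FALSE)

    dbDisconnect(con)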


On Sep 2, 2011, at 1:13, Worik R <worikr at gmail.com> wrote:

> Friends
> 
> I am starting on a (section of the) project where I need to build a matrix
> with on the order of 5 million rows and 200 columns.
> 
> I am wondering if I can stay in R.
> 
> I need to do rollapply-type operations on the columns, including some that
> will be functions of (windows of) two columns.
> 
> I have been looking at the ff and bigmemory packages but am not sure that
> they will do the job.
> 
> Before I get too deep, can someone offer some wisdom about what the best
> direction to go would be?
> 
> Switching to C/C++ is definitely an option if it is all too hard.
> 
> cheers
> Worik


