[R] Re : Large database help

Robert Citek rwcitek at alum.calberkeley.org
Tue May 16 22:57:46 CEST 2006


On May 16, 2006, at 11:19 AM, Prof Brian Ripley wrote:
> Well, there *is* a manual about R Data Import/Export, and this does
> discuss using R with DBMSs with examples.  How about reading it?

Thanks for the pointer:

   http://cran.r-project.org/doc/manuals/R-data.html#Relational-databases

Unfortunately, that manual doesn't really answer my question.  My
question is not how to make R interact with a database, but rather
how to make R interact with a database containing large data sets.

> The point being made is that you can import just the columns you  
> need, and indeed summaries of those columns.

That sounds great in theory.  Now I want to reduce it to practice.   
In the toy problem from the previous post, how can one compute the  
mean of a set of 1e9 numbers?  R has difficulty even generating a set
of a billion (1e9) numbers, let alone taking the mean of that set.  To wit:

   bigset <- runif(1e9,0,1e9)

runs out of memory on my system.  I realize that I can do some fancy  
data shuffling and hand-waving to calculate the mean.  But I was  
wondering if R has a module that already abstracts out that magic,  
perhaps using a database.
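
For reference, the "data shuffling" could be sketched roughly as below, on the
assumption that the numbers can be generated (or read) one chunk at a time.
The names chunked_mean, chunk_size and gen are hypothetical, made up for this
illustration; they are not part of R or of any package.

```r
# Mean of n values produced chunk by chunk, so that only one chunk
# (chunk_size doubles) is held in memory at any time.
# 'gen' is a hypothetical stand-in for whatever supplies the next k values,
# e.g. a random generator here, or a block of rows fetched from a database.
chunked_mean <- function(n, chunk_size = 1e6,
                         gen = function(k) runif(k, 0, 1e9)) {
  total <- 0   # running sum of all values seen so far
  done  <- 0   # number of values consumed so far
  while (done < n) {
    k     <- min(chunk_size, n - done)  # last chunk may be shorter
    total <- total + sum(gen(k))
    done  <- done + k
  }
  total / n
}

set.seed(1)
m <- chunked_mean(1e7)   # 1e7 here to keep the demo quick; 1e9 just takes longer
```

Memory use depends on chunk_size rather than on n, at the cost of one pass
over the data.  If the numbers already sit in a DBMS, the same reduction can
presumably be left to the database itself with an aggregate such as SQL's AVG().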

Any pointers to more detailed reading are greatly appreciated.

Regards,
- Robert
http://www.cwelug.org/downloads
Help others get OpenSource software.  Distribute FLOSS
for Windows, Linux, *BSD, and MacOS X with BitTorrent



