[R] Reading huge chunks of data from MySQL into Windows R

Duncan Murdoch murdoch at stats.uwo.ca
Mon Jun 6 15:49:22 CEST 2005


On 6/6/2005 9:30 AM, Dubravko Dolic wrote:
> Dear List,
> 
> I'm trying to use R under Windows on a huge MySQL database via ODBC
> (for technical reasons...). I want to read tables with some
> 160,000,000 entries into R, and I would be grateful for any hints on
> what to consider concerning memory management. I'm not sure about the
> best method for reading such huge tables into R; for the moment I
> split the whole table into readable parts and stick them together
> again in R.
> 
> Any hints welcome.
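
To make this concrete: the read-in-pieces-and-reassemble approach you 
describe presumably looks something like the sketch below (the DSN 
"myDSN", the table "bigtable" and the block size are invented here, so 
adjust to taste).  The final rbind() is the step that needs the whole 
table in memory at once.

    library(RODBC)
    ch <- odbcConnect("myDSN")
    parts <- list()
    blocksize <- 1e7                  # 10 million rows per piece
    for (i in 0:15) {                 # 16 pieces = 160 million rows
        sql <- sprintf("SELECT * FROM bigtable LIMIT %.0f OFFSET %.0f",
                       blocksize, i * blocksize)
        parts[[i + 1]] <- sqlQuery(ch, sql)
    }
    odbcClose(ch)
    big <- do.call("rbind", parts)    # reassembly: the memory-hungry step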

Most values in R are stored as 8-byte doubles, so 160,000,000 entries 
will take roughly 1.2 GB of storage (half that if they are integers or 
factors).  You are likely to run into problems manipulating something 
that big on Windows, because a user process is normally allowed only 2 
GB of address space, and that space can become fragmented.
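
The arithmetic, in R itself (counting 2^30 bytes per GB):

    160e6 * 8 / 2^30              # doubles:  about 1.2 GB
    160e6 * 4 / 2^30              # integers: about 0.6 GB
    object.size(numeric(1e6))     # ~8 MB per million doubles, measured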

I'd suggest developing algorithms that work on the data a block at a 
time, so that you never need to assemble the whole thing in R at once; 
a sketch of that approach follows below.  Alternatively, switch to a 
64-bit platform and install lots of memory -- but there are still 
various 4 GB limits inside R, so you may still run into trouble.
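
For example, a minimal sketch of the blockwise idea using RODBC and 
MySQL's LIMIT/OFFSET (again, the DSN, table and column names are 
invented; replace the running-sum step with whatever statistic you 
actually need):

    library(RODBC)
    ch <- odbcConnect("myDSN")
    blocksize <- 1e5
    offset <- 0
    n <- 0                            # running row count
    s <- 0                            # running sum
    repeat {
        sql <- sprintf("SELECT x FROM bigtable LIMIT %.0f OFFSET %.0f",
                       blocksize, offset)
        block <- sqlQuery(ch, sql)
        if (!is.data.frame(block) || nrow(block) == 0) break
        n <- n + nrow(block)
        s <- s + sum(block$x)
        offset <- offset + blocksize
    }
    odbcClose(ch)
    s / n                             # mean of x, never all in memory

RODBC can also hand you one pending result set in pieces: submit the 
query once with odbcQuery(ch, sql) and then call 
sqlGetResults(ch, max = blocksize) repeatedly, which avoids re-running 
the query for every block.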

Duncan Murdoch



