[R] handle large matrix in R

Steve Lianoglou mailinglist.honeypot at gmail.com
Tue Jun 12 13:04:06 CEST 2012


Hi,

As Oliver pointed out, you won't be able to fit all that data into RAM
unless you've got some big iron machine.

Besides using a sparse matrix representation, you might also look at
the "Large memory and out-of-memory data" section here:

http://cran.r-project.org/web/views/HighPerformanceComputing.html

In particular, take a look at the ff and bigmemory packages.
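
To make the numbers concrete, here is a rough sketch of the arithmetic
behind Oliver's estimate, followed by a sparse alternative built with the
Matrix package. The number of non-zero entries (nnz) and the random
positions are made up purely for illustration; I don't know what your data
actually looks like, so treat this as an assumption-laden example rather
than a recipe.

```r
library(Matrix)  # recommended package, normally shipped with R

# Dense storage: every entry is an 8-byte double, zero or not.
n <- 70000
bytes_per_double <- 8
dense_bytes <- n * n * bytes_per_double
dense_gib <- dense_bytes / 2^30
cat(sprintf("dense %d x %d matrix: %.1f GiB\n", n, n, dense_gib))

# Sparse storage: only the non-zero entries (plus their indices) are kept.
# nnz is a hypothetical count chosen for illustration only.
set.seed(1)
nnz <- 1e5
i <- sample.int(n, nnz, replace = TRUE)  # row indices of non-zero entries
j <- sample.int(n, nnz, replace = TRUE)  # column indices of non-zero entries
m <- sparseMatrix(i = i, j = j, x = runif(nnz), dims = c(n, n))

# Memory actually used by the sparse object
cat("sparse object size:", format(object.size(m), units = "Mb"), "\n")
```

The dense estimate comes out at roughly 36.5 GiB (39.2e9 bytes), which is
the "close to 40 Gigabyte" figure quoted below. The sparse version only
pays for the entries that are actually present, so it only helps if most of
your matrix really is zero.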

-steve

On Tue, Jun 12, 2012 at 6:47 AM, Oliver Ruebenacker <curoli at gmail.com> wrote:
>     Hello Hui,
>
> On Tue, Jun 12, 2012 at 2:12 AM, Hui Wang <huiwang.biostats at gmail.com> wrote:
>> Dear all,
>>
>> I have a question about handling large matrices in R. I'd like to
>> define a 70000*70000 matrix in R on Platform:
>> x86_64-apple-darwin9.8.0/x86_64 (64-bit), but R seems to run out of
>> memory when I try. Is this due to a memory limit in R or to the RAM of
>> my laptop? If I use a cluster with more RAM, will R be able to handle
>> this large matrix? Thanks much!
>
>  Do you really mean 7e4 by 7e4? That would be 4.9e9 entries. If each
> entry takes 8 bytes (as a double-precision number in R does), you
> would need close to 40 gigabytes of memory for this matrix. I'm not sure
> there is a laptop on the market with that amount of RAM.
>
>  What do you need such a large matrix for? If most of the elements
> are zero, you don't want a regular matrix to hold the data; use
> some sort of sparse matrix implementation instead.
>
>     Take care
>     Oliver
>
> --
> Oliver Ruebenacker
> Bioinformatics Consultant (http://www.knowomics.com/wiki/Oliver_Ruebenacker)
> Knowomics, The Bioinformatics Network (http://www.knowomics.com)
> SBPAX: Turning Bio Knowledge into Math Models (http://www.sbpax.org)
>



-- 
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact


