[R] Working with massive matrices in R

jim holtman jholtman at gmail.com
Tue Apr 19 00:22:15 CEST 2011


It is probably a contiguous-memory problem: R has to find a single
free block large enough for the whole 12.6Gb vector.  I always suggest
having 3-4X as much memory as your largest object, so that there is
room for any copies that might be made.  So request about 50GB of
memory.
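
As a rough check on that figure, the arithmetic can be sketched in R.
The dimensions come from the post below; the variable names are
illustrative only, and 8 bytes per element assumes the matrix is
stored as double precision, as the post states.

```r
# Dimensions taken from the original post.
n_rows <- 3200
n_cols <- 527829
bytes_per_double <- 8   # assumes double-precision storage

# Size of one full copy of the matrix, in units of 2^30 bytes
# (which is what R labels "Gb").
one_copy_gb <- n_rows * n_cols * bytes_per_double / 2^30

# The 3-4X rule of thumb applied to the largest object.
budget_gb <- one_copy_gb * c(low = 3, high = 4)

print(round(one_copy_gb, 1))
print(round(budget_gb, 1))
```

The upper end of that range is where the "about 50GB" suggestion
comes from.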

On Mon, Apr 18, 2011 at 4:10 PM, svrieze <vrie0006 at umn.edu> wrote:
> Hello,
>
> I'm (eventually) attempting a singular value decomposition of a 3200 x
> 527829 matrix in R version 2.10.1.  The script is as follows:
> ###---------Begin Script here-------###
> library(Matrix)
>
> snps <- 527829                   ## Number of SNPs
> N <- 3200                        ## Sample size
> y <- rnorm(N, 100,1)               ## simulated phenotype
> system.time(
> ## read in matrix 3200 x 527829
> x <- scan("gedi7.raw", what=rep(0,snps), nmax=N*snps, skip=1))
> system.time(x <- matrix(x,nrow=N,ncol=snps, byrow=TRUE))
> print(object.size(x), units="Mb")
> ###--------End Script----------------####
>
> The scan function finishes without a problem.  "x" is in double precision
> floating point format and takes up 12886.5Mb of memory at the first
> object.size() statement.
>
> When I convert it to a matrix I get an error stating that I cannot allocate
> a vector of size 12.6Gb.  I have requested 31Gb of memory on the server.
> 12.6+ 12.8 = 25.4Gb of used memory.  Is it that R is using considerable
> memory for operations not directly related to storing the matrix objects
> here?  Or is this perhaps a problem of contiguous memory?
>
> Any help is greatly appreciated.
>
> -Scott
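
One possible way to sidestep the second 12.6Gb allocation, offered as
a sketch rather than a tested fix for this data set: matrix(...,
byrow=TRUE) builds a new object, whereas assigning dim() to the
scanned vector reuses it.  Because R stores matrices column by column,
a vector read row by row becomes the snps x N transpose, which has the
same singular values.  The small N and snps below are stand-ins, and
whether R avoids a copy can depend on the R version and on other
references to x.

```r
# Small stand-in for the real data: N rows and snps columns, read row by row.
N <- 3
snps <- 4
x <- as.numeric(1:(N * snps))   # plays the role of the vector returned by scan()

# Approach from the post: allocates a second full-size object.
ref <- matrix(x, nrow = N, ncol = snps, byrow = TRUE)

# Alternative: attach dimensions to the existing vector.
# Column-major storage makes this the snps x N transpose of 'ref'.
dim(x) <- c(snps, N)

# Same data, transposed layout.
stopifnot(identical(x, t(ref)))

# The singular values of a matrix and of its transpose agree.
stopifnot(isTRUE(all.equal(svd(x)$d, svd(ref)$d)))
```

With the transposed layout, the roles of u and v in the svd() result
are swapped relative to the N x snps matrix.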



-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?


