[R] bigmemory: Using backing file as alternate to write.big.matrix

Shraddha Pai Shraddha_Pai at camh.net
Thu Mar 21 14:54:49 CET 2013


OK, I ran a test where I did both: wrote a ~6M x 58 double matrix out as a
.txt file (write.big.matrix), and also left the backing file and descriptor
file in place (rather than deleting them, as I usually do). I then opened a
different R session and compared the first 100 rows of both; they are
identical. Size-wise, the .bin file is over twice the size of the .txt file
(here .bin was 2,641 MB and .txt was 1,184 MB).
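
A rough sanity check on those sizes, as a sketch only. It assumes the backing
file holds raw 8-byte doubles with negligible overhead and that "MB" means
2^20 bytes; the exact row count is not stated above, so approx_rows is just
the "~6M" figure and implied_rows is back-calculated from the file size.

```r
# Back-of-the-envelope size of a double-typed backing file:
# rows * cols * 8 bytes per value.
n_cols <- 58
bytes_per_double <- 8
approx_rows <- 6e6   # "~6M" from the post; the exact count is an assumption

# Expected .bin size in MB if the matrix had exactly 6e6 rows
expected_mb <- approx_rows * n_cols * bytes_per_double / 1024^2

# Row count implied by the reported 2,641 MB .bin file
implied_rows <- 2641 * 1024^2 / (n_cols * bytes_per_double)

# Average bytes per value in the 1,184 MB text file, separators included
txt_bytes_per_value <- 1184 * 1024^2 / (implied_rows * n_cols)

print(c(expected_mb = expected_mb,
        implied_rows = implied_rows,
        txt_bytes_per_value = txt_bytes_per_value))
```

Under these assumptions the text file averages well under 8 bytes per value,
which is consistent with the .bin file being more than twice as large.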

So my conclusion is this: if the matrix will be read often by downstream
programs, keep the .bin backing file (plus its descriptor file). Code that
reads the matrix can just attach it, which is super fast (0.002s elapsed; in
contrast, using read.big.matrix to read the .txt version took 76s on my
machine). If disk space is a constraint and the matrix isn't expected to be
read "very often", then save it as a text file and read it with
read.big.matrix.
-----
library(bigmemory)
m <- attach.big.matrix("rawXpr.desc") # attach via the descriptor file -- super fast
# same content saved as txt -- read only the first 100 rows for the test
n <- read.table("rawXpr.txt", sep="\t", header=FALSE, as.is=TRUE, nrows=100)
n <- as.matrix(n) # was a data.frame before
# compare row by row; names are dropped so that only the values are compared
sapply(1:nrow(n), function(x) isTRUE(all.equal(unname(n[x,]), unname(m[x,]))))

  [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
 [16] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
 [31] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
 [46] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
 [61] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
 [76] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
 [91] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
-----
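
For completeness, the write side of this workflow might look something like
the sketch below. It is not the code used for the test above: the file names
(demo.bin, demo.desc), the temporary directory and the small 1000 x 58 matrix
are all made up for illustration, and it assumes a reasonably recent bigmemory
in which attach.big.matrix accepts the path to a descriptor file.

```r
library(bigmemory)

# Hypothetical location and file names, for illustration only
demo_dir <- tempdir()

# Create a file-backed matrix; this writes demo.bin and demo.desc to demo_dir
x <- filebacked.big.matrix(nrow = 1000, ncol = 58, type = "double",
                           backingfile = "demo.bin",
                           descriptorfile = "demo.desc",
                           backingpath = demo_dir)

# Assignments go to the memory-mapped backing file
x[, 1] <- rnorm(1000)

# Normally done from a different R session: attach via the descriptor
# instead of re-reading a text dump
y <- attach.big.matrix(file.path(demo_dir, "demo.desc"))

# Both objects point at the same backing file
print(identical(x[, 1], y[, 1]))
```

The point of keeping the descriptor file is that a later session only needs
its path to attach the matrix; no data are parsed at attach time.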






--
View this message in context: http://r.789695.n4.nabble.com/bigmemory-Using-backing-file-as-alternate-to-write-big-matrix-tp4661958p4662055.html
Sent from the R help mailing list archive at Nabble.com.


