[R] Compress string memCompress/Decompress

Matt Shotwell shotwelm at musc.edu
Fri Jul 9 22:21:10 CEST 2010


Erik, 

Can you store the data as a blob? For example:

> #create string, compress with gzip, convert to SQLite blob string
> string <- "gzip this string, store as blob in SQLite database"
> string.gz <- memCompress(string, type="gzip")
> string.sqlite <- paste("x'",paste(string.gz,collapse=""),"'",sep="")

> #create database and table with a BLOB column
> library(RSQLite)
Loading required package: DBI
> con <- dbConnect(dbDriver("SQLite"), "compress.sqlite")
> dbGetQuery(con, "CREATE TABLE Compress (id INTEGER, data BLOB);")
NULL

> #insert the string as a blob
> query <- paste("INSERT INTO Compress (id, data) VALUES (1, ", 
+ string.sqlite, ");", sep="")
> dbGetQuery(con, query)
NULL
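
The INSERT above splices the hex literal into the SQL text. As a rough sketch of an alternative, assuming a reasonably recent DBI and RSQLite, the raw vector can instead be bound as a query parameter so no hex string has to be built. The in-memory database and the `restored` name are assumptions made for illustration, not part of the session above.

```r
library(DBI)

# Assumption: an in-memory database, so the sketch leaves no file behind.
con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbExecute(con, "CREATE TABLE Compress (id INTEGER, data BLOB)")

string <- "gzip this string, store as blob in SQLite database"
string.gz <- memCompress(charToRaw(string), type = "gzip")

# Bind the raw vector as a parameter; wrapping it in list() passes
# the whole vector as a single BLOB value rather than one row per byte.
dbExecute(con, "INSERT INTO Compress (id, data) VALUES (?, ?)",
          params = list(1L, list(string.gz)))

# The BLOB column comes back as a list of raw vectors, one per row.
result <- dbGetQuery(con, "SELECT data FROM Compress WHERE id = 1")
restored <- rawToChar(memDecompress(result$data[[1]], type = "gzip"))

dbDisconnect(con)
```

Binding also sidesteps any limit on the length of the SQL statement, which may matter for long strings.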

> #recover the blob, decompress, and convert back to a string
> result <- dbGetQuery(con, "SELECT data FROM Compress;")
> string.gz <- result[[1]][[1]]
> string <- memDecompress(string.gz, type="gzip")
> rawToChar(string)
[1] "gzip this string, store as blob in SQLite database"
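
If the column really has to be TEXT rather than BLOB, one possible approach, sketched here with base R only, is to store the compressed bytes as hexadecimal text and decode them after reading. The names `string.hex`, `pairs`, `restored.gz`, and `restored` are made up for this example. Note that hex text takes two characters per byte, so this only saves space when compression shrinks the data by more than half.

```r
# Compress a string and encode the raw bytes as hexadecimal text.
string <- "gzip this string, store as text in SQLite database"
string.gz <- memCompress(charToRaw(string), type = "gzip")
string.hex <- paste(as.character(string.gz), collapse = "")

# string.hex is an ordinary character string and can go in a TEXT field.

# Decode: split the text into two-character pairs, convert each pair
# from base 16, and rebuild the raw vector.
starts <- seq(1, nchar(string.hex), by = 2)
pairs <- substring(string.hex, starts, starts + 1)
restored.gz <- as.raw(strtoi(pairs, base = 16L))

# Decompress and convert back to a character string.
restored <- rawToChar(memDecompress(restored.gz, type = "gzip"))
identical(restored, string)
```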


-Matt



On Fri, 2010-07-09 at 12:51 -0400, Erik Wright wrote:
> Hello,
> 
> I would like to compress a long string (character vector), store the compressed string in the text field of a SQLite database (using RSQLite), and then load the text back into memory and decompress it back into the original string.  My character vector can be compressed considerably using standard gzip/bzip2 compression.  In theory it should be much faster for me to compress/decompress a long string than to write the whole string to the hard drive and then read it back (not to mention the saved hard drive space).
> 
> I have tried accomplishing this task using memCompress() and memDecompress() without success.  It seems memCompress() can only convert a character vector to raw type, which cannot be treated as a string.  Does anyone have ideas on how I can go about doing this, especially using the standard base packages?
> 
> Thanks!
> Erik
> 
> 
> > sessionInfo()
> R version 2.11.0 (2010-04-22) 
> x86_64-apple-darwin9.8.0 
> 
> locale:
> [1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8
> 
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base     
> 
> loaded via a namespace (and not attached):
> [1] tools_2.11.0
> 
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
-- 
Matthew S. Shotwell
Graduate Student
Division of Biostatistics and Epidemiology
Medical University of South Carolina
http://biostatmatt.com


