[R] Compress string memCompress/Decompress

Erik Wright eswright at wisc.edu
Sat Jul 10 02:02:22 CEST 2010


Hi Matt,

This works great, thanks!

At first I got an error message saying that BLOB is not implemented in RSQLite, but it worked once I updated RSQLite to the latest version.

Is there any reason the string needs to be stored as type BLOB?  It seems to work the same when I swap "BLOB" with "TEXT" in the CREATE TABLE command.
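
One way to see what SQLite actually stores in each case is to ask it with typeof(). This is only a minimal sketch, assuming a recent DBI/RSQLite where dbExecute() exists; the in-memory database, the table names t_blob and t_text, and the repeated test string are made up for illustration and are not from this thread.

```r
library(DBI)

# Hypothetical test data: a repetitive string that gzip compresses well
string <- paste(rep("gzip this string ", 200), collapse = "")
string.gz <- memCompress(charToRaw(string), type = "gzip")

# SQLite blob literal of the form x'<hex digits>'
blob.literal <- paste0("x'", paste(string.gz, collapse = ""), "'")

# In-memory database (an assumption, so no file is left behind)
con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbExecute(con, "CREATE TABLE t_blob (id INTEGER, data BLOB);")
dbExecute(con, "CREATE TABLE t_text (id INTEGER, data TEXT);")
for (tbl in c("t_blob", "t_text")) {
  dbExecute(con, paste0("INSERT INTO ", tbl, " (id, data) VALUES (1, ",
                        blob.literal, ");"))
}

# typeof() reports the storage class SQLite used for each stored value
type.blob <- dbGetQuery(con, "SELECT typeof(data) AS type FROM t_blob;")$type
type.text <- dbGetQuery(con, "SELECT typeof(data) AS type FROM t_text;")$type

# Round trip from the column that was declared as TEXT
stored <- dbGetQuery(con, "SELECT data FROM t_text;")$data[[1]]
recovered <- rawToChar(memDecompress(stored, type = "gzip"))
dbDisconnect(con)
```

If SQLite's documented type affinity rules apply here, type.blob and type.text should both come back as "blob", since a blob literal is not converted when it goes into a TEXT column. That would explain why swapping the declared type makes no visible difference.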

Thanks again!
Erik



On Jul 9, 2010, at 3:21 PM, Matt Shotwell wrote:

> Erik, 
> 
> Can you store the data as a blob? For example:
> 
>> #create string, compress with gzip, convert to SQLite blob string
>> string <- "gzip this string, store as blob in SQLite database"
>> string.gz <- memCompress(string, type="gzip")
>> string.sqlite <- paste("x'",paste(string.gz,collapse=""),"'",sep="")
> 
>> #create database and table with a BLOB column
>> library(RSQLite)
> Loading required package: DBI
>> con <- dbConnect(dbDriver("SQLite"), "compress.sqlite")
>> dbGetQuery(con, "CREATE TABLE Compress (id INTEGER, data BLOB);")
> NULL
> 
>> #insert the string as a blob
>> query <- paste("INSERT INTO Compress (id, data) VALUES (1, ", 
> + string.sqlite, ");", sep="")
>> dbGetQuery(con, query)
> NULL
> 
>> #recover the blob, decompress, and convert back to a string
>> result <- dbGetQuery(con, "SELECT data FROM Compress;")
>> string.gz <- result[[1]][[1]]
>> string <- memDecompress(string.gz, type="gzip")
>> rawToChar(string)
> [1] "gzip this string, store as blob in SQLite database"
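
If the compressed strings get long, pasting them into the SQL text can be avoided by binding the raw vector as a query parameter. A rough sketch of that variant, assuming a recent DBI/RSQLite where dbExecute() and its params argument are available (neither appears in the code above), and using an in-memory database instead of "compress.sqlite":

```r
library(DBI)

string <- "gzip this string, store as blob in SQLite database"
string.gz <- memCompress(charToRaw(string), type = "gzip")

# In-memory database so the sketch leaves no file behind (an assumption)
con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbExecute(con, "CREATE TABLE Compress (id INTEGER, data BLOB);")

# Bind the values instead of pasting them into the SQL string;
# a blob parameter is passed as a list of raw vectors
dbExecute(con, "INSERT INTO Compress (id, data) VALUES (?, ?);",
          params = list(1L, list(string.gz)))

# Recover the blob, decompress, and convert back to a string
result <- dbGetQuery(con, "SELECT data FROM Compress WHERE id = 1;")
roundtrip <- rawToChar(memDecompress(result$data[[1]], type = "gzip"))
dbDisconnect(con)
```

The design choice here is only about how the value reaches SQLite: the stored bytes are the same, but no hex string has to be built or parsed.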
> 
> 
> -Matt
> 
> 
> 
> On Fri, 2010-07-09 at 12:51 -0400, Erik Wright wrote:
>> Hello,
>> 
>> I would like to compress a long string (character vector), store the compressed string in the text field of a SQLite database (using RSQLite), and then load the text back into memory and decompress it back into the original string.  My character vector can be compressed considerably using standard gzip/bzip2 compression.  In theory it should be much faster for me to compress/decompress a long string than to write the whole string to the hard drive and then read it back (not to mention the saved hard drive space).
>> 
>> I have tried accomplishing this task using memCompress() and memDecompress() without success.  It seems memCompress can only convert a character vector to raw type which cannot be treated as a string.  Does anyone have ideas on how I can go about doing this, especially using the standard base packages?
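
The raw vector that memCompress() returns can be turned into an ordinary character string by hex-encoding it, at the cost of doubling its size. A minimal sketch using only base R; the helper names raw_to_hex and hex_to_raw and the test string are hypothetical, and hex is just one possible text encoding:

```r
# Hypothetical helpers: hex-encode a raw vector and decode it again
raw_to_hex <- function(r) paste(as.character(r), collapse = "")
hex_to_raw <- function(h) {
  starts <- seq(1, nchar(h), by = 2)
  as.raw(strtoi(substring(h, starts, starts + 1), base = 16L))
}

# Made-up long, repetitive string standing in for the real data
long.string <- paste(rep("ACGTTGCA", 5000), collapse = "")

# Compress to raw, then encode as text that any TEXT field can hold
compressed <- memCompress(charToRaw(long.string), type = "gzip")
as.text <- raw_to_hex(compressed)

# Decode the text back to raw, decompress, and convert to a string
restored <- rawToChar(memDecompress(hex_to_raw(as.text), type = "gzip"))
```

Here as.text has two characters per compressed byte, so this only pays off when the data compresses to well under half its original size.
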
>> 
>> Thanks!
>> Erik
>> 
>> 
>>> sessionInfo()
>> R version 2.11.0 (2010-04-22) 
>> x86_64-apple-darwin9.8.0 
>> 
>> locale:
>> [1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8
>> 
>> attached base packages:
>> [1] stats     graphics  grDevices utils     datasets  methods   base     
>> 
>> loaded via a namespace (and not attached):
>> [1] tools_2.11.0
>> 
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
> -- 
> Matthew S. Shotwell
> Graduate Student
> Division of Biostatistics and Epidemiology
> Medical University of South Carolina
> http://biostatmatt.com
>


