[Rd] inflate zlib compressed data using base R or CRAN package?

Simon Urbanek simon.urbanek at r-project.org
Fri Nov 29 01:48:07 CET 2013


On Nov 27, 2013, at 8:30 PM, Murray Stokely <murray at stokely.org> wrote:

> I think none of these examples describe a zlib compressed data block inside a binary file that the OP asked about, as all of your examples are e.g. prepending gzip or zip headers.
> 
> Greg, is memDecompress what you are looking for?
> 

I think so.

But this is interesting: I think the documentation of memCompress/memDecompress is not quite correct, and the parameter names are misleading. Although it does mention the gzip headers, it is incorrect, since the zlib format is not a subset of the gzip format (albeit they use the same compression method). So you cannot extract gzip content using zlib decompression: you'll get "internal error -3 in memDecompress(2)" if you try, since it expects the zlib header, which is different from the gzip one. So "gzip" in type is a misnomer. It should say "zlib", since it can neither read nor write the gzip format. The documentation should also make this clear, since it's pointless to try to use this on gzip contents.

The better alternative would be to support both gzip and zlib, since R can deal with both. The issue is that this would break code that used type="gzip" explicitly to mean "zlib", so I'm not sure there is a good way out.
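
For the zlib-block-inside-a-binary-file case, something along these lines should work with base R alone. This is only a minimal sketch: the file layout (a 4-byte "HDR0" marker followed by the compressed block running to the end of the file) and all variable names are made up for illustration, so a real format would need its own offsets and lengths.

```r
# Toy payload to compress (illustrative data only).
payload <- charToRaw(paste(rep("zlib is not gzip. ", 20), collapse = ""))

# Despite the name, type = "gzip" writes a zlib-wrapped deflate stream.
block <- memCompress(payload, type = "gzip")

# A zlib stream with the default 32K window starts with 0x78,
# whereas a gzip file starts with the magic bytes 1f 8b.
first_byte <- block[1]

# Write a hypothetical binary file: 4-byte marker, then the zlib block.
path <- tempfile(fileext = ".bin")
con <- file(path, "wb")
writeBin(charToRaw("HDR0"), con)
writeBin(block, con)
close(con)

# Read the whole file as raw bytes, drop the 4-byte marker, inflate the rest.
bytes <- readBin(path, what = "raw", n = file.info(path)$size)
inflated <- memDecompress(bytes[-(1:4)], type = "gzip")

# The round trip should give back the original bytes.
ok <- identical(inflated, payload)
unlink(path)
```

If the compressed block does not run to the end of the file, the raw vector would have to be subset to the block's exact byte range before calling memDecompress.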

Cheers,
Simon


> 
> On Wed, Nov 27, 2013 at 5:22 PM, Dirk Eddelbuettel <edd at debian.org> wrote:
> 
>> 
>> On 27 November 2013 at 18:38, Dirk Eddelbuettel wrote:
>> |
>> | On 27 November 2013 at 23:49, Dr Gregory Jefferis wrote:
>> | | I have a binary file type that includes a zlib compressed data block (ie
>> | | not gzip). Is anyone aware of a way using base R or a CRAN package to
>> | | decompress this kind of data (from disk or memory). So far I have found
>> | | Rcompression::decompress on omegahat, but I would prefer to keep
>> | | dependencies on CRAN (or bioconductor). I am also trying to avoid
>> | | writing yet another C level interface to part of zlib.
>> |
>> | Unless I am missing something, this is in base R; see help(connections).
>> |
>> | Here is a quick demo:
>> |
>> | R> write.csv(trees, file="/tmp/trees.csv")    # data we all have
>> | R> system("gzip -v /tmp/trees.csv")           # as I am lazy here
>> | /tmp/trees.csv:        50.5% -- replaced with /tmp/trees.csv.gz
>> | R> read.csv(gzfile("/tmp/trees.csv.gz"))      # works out of the box
>> 
>> Oh, and in case you meant zip file containing a data file, that also works.
>> 
>> First converting what I did last
>> 
>> edd at max:/tmp$ gunzip trees.csv.gz
>> edd at max:/tmp$ zip trees.zip trees.csv
>>  adding: trees.csv (deflated 50%)
>> edd at max:/tmp$
>> 
>> Then reading the csv from inside the zip file:
>> 
>> R> read.csv(unz("/tmp/trees.zip", "trees.csv"))
>>    X Girth Height Volume
>> 1   1   8.3     70   10.3
>> 2   2   8.6     65   10.3
>> 3   3   8.8     63   10.2
>> 4   4  10.5     72   16.4
>> 5   5  10.7     81   18.8
>> 6   6  10.8     83   19.7
>> 7   7  11.0     66   15.6
>> 8   8  11.0     75   18.2
>> 9   9  11.1     80   22.6
>> 10 10  11.2     75   19.9
>> 11 11  11.3     79   24.2
>> 12 12  11.4     76   21.0
>> 13 13  11.4     76   21.4
>> 14 14  11.7     69   21.3
>> 15 15  12.0     75   19.1
>> 16 16  12.9     74   22.2
>> 17 17  12.9     85   33.8
>> 18 18  13.3     86   27.4
>> 19 19  13.7     71   25.7
>> 20 20  13.8     64   24.9
>> 21 21  14.0     78   34.5
>> 22 22  14.2     80   31.7
>> 23 23  14.5     74   36.3
>> 24 24  16.0     72   38.3
>> 25 25  16.3     77   42.6
>> 26 26  17.3     81   55.4
>> 27 27  17.5     82   55.7
>> 28 28  17.9     80   58.3
>> 29 29  18.0     80   51.5
>> 30 30  18.0     80   51.0
>> 31 31  20.6     87   77.0
>> R>
>> 
>> Regards, Dirk
>> 
>> --
>> Dirk Eddelbuettel | edd at debian.org | http://dirk.eddelbuettel.com
>> 
>> ______________________________________________
>> R-devel at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>> 