[Rd] Decompressing raw vectors in memory

Hadley Wickham hadley at rice.edu
Wed May 2 17:43:22 CEST 2012


>> I'm struggling to decompress a gzip'd raw vector in memory:
>>
>> content <- readBin("http://httpbin.org/gzip", "raw", 1000)
>>
>> memDecompress(content, type = "gzip")
>> # Error in memDecompress(content, type = "gzip") :
>> #  internal error -3 in memDecompress(2)
>>
>> I'm reasonably certain that the file is correctly compressed, because
>> if I save it out to a file, I can read the uncompressed data:
>>
>> tmp <- tempfile()
>> writeBin(content, tmp)
>> readLines(tmp)
>>
>> So that suggests I'm using memDecompress incorrectly.  Any hints?
>
> Headers.

Looking at http://tools.ietf.org/html/rfc1952:

* the first two bytes are id1 and id2, which are 1f 8b as expected

* the third byte is the compression method: as.integer(content[3]) is 8,
  i.e. deflate

* the fourth byte is the flags byte (FLG)

  rawToBits(content[4])
  [1] 00 00 00 00 00 00 00 00

  which indicates that none of the optional header fields are present,
  so the header should be just the fixed 10 bytes

So the header looks ok to me (with my limited knowledge of gzip).
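
The checks above can be wrapped up as a rough sketch. gzip_header() is a made-up helper name, not part of R, and the sample bytes are generated locally with gzfile() as a stand-in for the httpbin.org response, so treat it as illustrative only:

```r
# Hypothetical helper: inspect the fixed part of a gzip header (RFC 1952).
gzip_header <- function(x) {
  stopifnot(is.raw(x), length(x) >= 10L)
  # rawToBits() returns the least-significant bit first, so elements 1..5
  # line up with FLG bits 0..4 as numbered in the RFC.
  flag_bits <- as.logical(rawToBits(x[4])[1:5])
  names(flag_bits) <- c("FTEXT", "FHCRC", "FEXTRA", "FNAME", "FCOMMENT")
  list(
    magic_ok = identical(x[1:2], as.raw(c(0x1f, 0x8b))),  # ID1, ID2
    method   = as.integer(x[3]),                          # 8 means deflate
    flags    = flag_bits
  )
}

# Stand-in data: a small gzip stream written by gzfile() and read back raw.
tmp <- tempfile()
con <- gzfile(tmp, "wb")
writeLines("hello gzip", con)
close(con)
sample_gz <- readBin(tmp, "raw", file.size(tmp))
unlink(tmp)

hdr <- gzip_header(sample_gz)
str(hdr)
```

If any of the flags came back TRUE, the header would be longer than 10 bytes and a fixed-size strip would cut in the wrong place.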

Stripping off the header doesn't seem to help either:

memDecompress(content[-(1:10)], type = "gzip")
# Error in memDecompress(content[-(1:10)], type = "gzip") :
#  internal error -3 in memDecompress(2)
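
For comparison, here is a sketch of an in-memory route that sidesteps memDecompress and the temp file for the decompression step. It assumes gzcon() will accept a rawConnection() and deal with the gzip framing itself. The sample bytes are again built locally with gzfile() rather than fetched from httpbin.org, and the variable names are only for illustration:

```r
# Stand-in for the downloaded bytes: a gzip stream created locally.
tmp <- tempfile()
con <- gzfile(tmp, "wb")
writeLines(c("first line", "second line"), con)
close(con)
content <- readBin(tmp, "raw", file.size(tmp))
unlink(tmp)

# Wrap the raw vector in a connection and let gzcon() do the decompression.
zz <- gzcon(rawConnection(content))
lines_out <- readLines(zz)
close(zz)  # closing the gzcon wrapper also closes the raw connection

print(lines_out)
```

That reads the data back without touching disk, but it doesn't explain what memDecompress wants as input.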

I've read the help for memDecompress but I don't see anything there to help me.

Any more hints?

Thanks!

Hadley

-- 
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/


