[R] compress data on read, decompress on write

Ramon Diaz-Uriarte rdiaz02 at gmail.com
Thu Feb 28 23:47:00 CET 2008


Dear Prof. Ripley,

Thanks a lot. I've just looked at Rcompression and it seems to do
exactly what I need, though it requires having zlib and bzip2
installed (and I am not sure whether this will deter some Windows users).
I'll also check connections.c, which might be a way to go; otherwise
we will hold the release of that version of our package until R 2.7.0
is out.
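
For the archive, here is a minimal sketch of the serialization route
mentioned in the quoted reply. It is not what our package does: it
assumes a more recent R in which base R provides memCompress() and
memDecompress() (to my knowledge, R 2.10.0 and later; they are not
available in the versions discussed in this thread). The vector x and
the list fit are hypothetical stand-ins for the sparse matrix produced
by the C code and for the R object that stores it.

```r
# Hypothetical stand-in for the large sparse matrix produced by the C code:
# a long numeric vector that is almost entirely zeros.
set.seed(1)
x <- numeric(1e5)
x[sample.int(length(x), 500)] <- rnorm(500)

# Serialize to a raw vector, then compress that raw vector in memory.
# Assumption: memCompress()/memDecompress() exist (base R >= 2.10.0).
raw_full <- serialize(x, connection = NULL)
fit <- list(call = "example",
            sparse = memCompress(raw_full, type = "gzip"))

# Later call: decompress and unserialize to recover the original object,
# which can then be passed to the C function.
restored <- unserialize(memDecompress(fit$sparse, type = "gzip"))

# Or write the data out uncompressed as plain doubles (8 bytes each).
out <- tempfile(fileext = ".bin")
writeBin(restored, out)
```

Only the compressed raw vector lives in the list, so the object stays
small between calls; the cost is one decompression each time the data
are needed.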


Best,

R.


On Thu, Feb 28, 2008 at 8:53 PM, Prof Brian Ripley
<ripley at stats.ox.ac.uk> wrote:
> One solution is likely to be the Omegahat package Rcompression.
>
>  Otherwise, R does have internal facilities for (gzip) compression
>  and decompression (e.g. see the end of src/main/connections.c),
>  and you could make creative use of serialization to do the
>  compression.
>
>
>  On Thu, 28 Feb 2008, Ramon Diaz-Uriarte wrote:
>
>  > Dear All,
>  >
>  > I'd like to be able to have R store (in a list component) a compressed
>  > data set, and then write it out uncompressed. gzcon and gzfile work in
>  > exactly the opposite direction. What would be a good way to handle
>  > this?
>  >
>  > Details:
>  > ----------
>  >
>  > We have a package that uses C; part of the C output is a large sparse
>  > matrix. This is never manipulated directly by R, but always by the C
>  > code. However, we need to store that data somewhere (inside an R
>  > object) for further calls to the functions in our package. We'd like
>  > to store that matrix as part of the R object (say, as an element of a
>  > list). Ideally, it would be stored in as compressed a way as possible.
>  > Then, when we need to use that information, it would be decompressed
>  > and passed to the C function.
>  >
>  > I guess one way to do it is to have C deal with the compression and
>  > decompression (e.g., using the zlib or bzip2 libraries) and then use
>  > readBin, etc., from R. But, if I can, I'd like to avoid our C code
>  > having to call zlib, etc., so as to keep our package easily portable.
>
>  As from R 2.7.0 you will be able to make use of zlib on effectively all
>  platforms, since it has a public interface on Windows.
>
>
>
>  >
>  > Thanks,
>  >
>  > R.
>  >
>  > --
>  > Ramon Diaz-Uriarte
>  > Statistical Computing Team
>  > Structural Biology and Biocomputing Programme
>  > Spanish National Cancer Centre (CNIO)
>  > http://ligarto.org/rdiaz
>
>  --
>  Brian D. Ripley,                  ripley at stats.ox.ac.uk
>  Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
>  University of Oxford,             Tel:  +44 1865 272861 (self)
>  1 South Parks Road,                     +44 1865 272866 (PA)
>  Oxford OX1 3TG, UK                Fax:  +44 1865 272595
>



-- 
Ramon Diaz-Uriarte
Statistical Computing Team
Structural Biology and Biocomputing Programme
Spanish National Cancer Centre (CNIO)
http://ligarto.org/rdiaz


