[Rd] gzfile & read.table on Win32

Jeff Gentry jgentry at jimmy.harvard.edu
Mon Mar 15 22:17:07 MET 2004


Hello ...

Are there any known problems or even gotchas to look out for when using a
gzfile connection in read.csv/read.table in Windows?

The package PROcess, available at
www.bioconductor.org/repository/devel/package/html/PROcess.html
has two files in its PROcess/inst/Test directory with the extension
.csv.gz.

With both files, if I open up a gzfile connection, say:
vv <- gzfile("122402imac40-s-c-192combined i11.csv.gz")
I can then do:
readLines(vv, n=10)

And it works as expected.  However, if I do this:

read.csv(vv)

I get a warning:
Warning: incomplete final line found by readTableHeader on
`c:/repository/checks/PROcess.Rcheck/PROcess/Test/122402imac40-s-c-192combined
i11.csv.gz'

and the result of the read.csv call is completely broken: it returns a
0-row matrix with a single column, named after the first column in the
csv file.  Furthermore, the connection variable itself seems to get
mangled in the process.  If I type the variable name (e.g. 'vv' from
above), I get:
> vv
Error in summary.connection(x) : invalid connection

Note that if I manually gunzip the file and then do a 'read.csv' in R,
everything works properly - so it doesn't appear to be the actual file
itself, but somehow related to reading it in as a compressed file.
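
For anyone who wants to try the same comparison without the PROcess files,
here is a minimal sketch.  The data frame, its column names and the temporary
file names are made up for illustration and are not the real test files; the
only R functions used are gzfile, write.csv, read.csv, readLines and
summary.  It writes the same small table as a plain and as a gzipped csv,
reads both back, and then checks whether the connection object can still be
inspected afterwards.

```r
# Hypothetical stand-in data; NOT the PROcess test files.
df <- data.frame(mz = c(1000.5, 2000.25, 3000.125),
                 intensity = c(1.5, 2.5, 3.5))

plain <- tempfile(fileext = ".csv")
gz    <- tempfile(fileext = ".csv.gz")

# Write the same table once uncompressed and once through a gzfile connection.
write.csv(df, plain, row.names = FALSE)
out <- gzfile(gz, "w")
write.csv(df, out, row.names = FALSE)
close(out)

# Reading a few lines through an explicitly opened connection.
vv <- gzfile(gz, "r")
first_lines <- readLines(vv, n = 2)
close(vv)

# Reading the whole table: uncompressed file vs. unopened gzfile connection.
from_plain <- read.csv(plain)
vv <- gzfile(gz)
from_gz <- read.csv(vv)

print(first_lines)
print(identical(from_plain, from_gz))
print(dim(from_gz))

# Report whether 'vv' can still be inspected after read.csv has used it.
conn_state <- tryCatch(summary(vv)$opened,
                       error = function(e) conditionMessage(e))
print(conn_state)
```

If the two data frames differ, or from_gz has 0 rows, that would match the
problem described above; if they are identical, the setup is presumably
reading compressed files correctly.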

This shows up on both R-1.8.1 and R-devel (admittedly a bit out of date:
I am currently using 2004-03-08 and am trying to update on Windows now).

Thanks
-J


