[R] unexpected behavior from gzfile and unz

Matthew Suderman matthewsuderman at yahoo.com
Wed Dec 19 20:10:18 CET 2007


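I figured out the problem: functions such as gzfile(), unz() and
file() create a connection to a file but do not *open* the
connection unless the 'open' argument is specified.  When handed
an unopened connection, readLines() and scan() open it for the
duration of the call and close it again afterwards, so every call
starts reading from the beginning of the file.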

... a little R gotcha for people who use other
programming languages and expect similar
concepts/behavior.
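
For the archives, here is a minimal sketch of the difference.  It assumes
a small tab-separated file written to a temporary path; the path and the
variable names are made up for illustration, and only base R functions
are used.

```r
# Write a small gzip-compressed, tab-separated file to a temporary path.
path <- tempfile(fileext = ".gz")
out <- gzfile(path, open = "w")
writeLines(c("a\tb\tc", "1\t2\t3"), out)
close(out)

# Unopened connection: each readLines() call opens the connection, reads,
# and closes it again, so both calls start at the top of the file.
con <- gzfile(path)
first_unopened  <- readLines(con, 1)
second_unopened <- readLines(con, 1)
close(con)  # destroys the connection object

# Connection opened at creation: the read position persists between calls.
con <- gzfile(path, open = "r")
first_opened  <- readLines(con, 1)
second_opened <- readLines(con, 1)
close(con)

# Same effect with an explicit open() after creating the connection.
con <- gzfile(path)
open(con, "r")
was_open <- isOpen(con)
all_lines <- readLines(con)
close(con)
```

Either passing 'open' when creating the connection or calling open()
on it before the first read should give the sequential behavior I was
expecting; in both cases close() is still needed when done.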

> I get unexpected behavior from "readLines()" and
> "scan()" depending on how the file is opened with
> "gzfile" or "unz".  More specifically:
> 
> > file <- gzfile("file.gz")
> > readLines(file,1)
> [1] "a\tb\tc"
> > readLines(file,1)
> [1] "a\tb\tc"
> > close(file)
> 
> It seems that the stream is rewound between calls to
> readLines.  The same is true if I replace readLines
> with scan.
> 
> However, if I set argument 'open="r"', then rewinding
> does not occur:
> 
> > file <- gzfile("file.gz",open="r")
> > readLines(file,1)
> [1] "a\tb\tc"
> > readLines(file,1)
> [1] "1\t2\t3"
> > close(file)
> 
> Once again, I get the same behavior for scan.  The
> rewinding behavior just described also appears if I
> open a zip file with "unz".
> 
> > file <- unz("file.zip", "file.txt")
> > readLines(file,1)
> [1] "a\tb\tc"
> > readLines(file,1)
> [1] "a\tb\tc"
> > close(file)
> 
> If I add the 'open="r"' argument to the call then I
> get an error from readLines:
> 
> > file <- unz("file.zip", "file.txt", open="r")
> > readLines(file,1)
> Error in readLines(file, 1) : seek not enabled for this connection
> > close(file)
> 
> 
> > file <- unz("file.zip", "file.txt", open="rb")
> > readLines(file,1)
> Error in readLines(file, 1) : seek not enabled for this connection
> > close(file)
> 
> However, if I instead use "scan" to read the file,
> then there are no errors and I get the rewind/no
> rewind behavior described above.
> 
> > file <- unz("file.zip", "file.txt")
> > scan(file,nlines=1,sep="\t",what=character(0))
> Read 3 items
> [1] "a" "b" "c"
> > scan(file,nlines=1,sep="\t",what=character(0))
> Read 3 items
> [1] "a" "b" "c"
> > close(file)
> 
> 
> > file <- unz("file.zip", "file.txt", open="r")
> > scan(file,nlines=1,sep="\t",what=character(0))
> Read 3 items
> [1] "a" "b" "c"
> > scan(file,nlines=1,sep="\t",what=character(0))
> Read 3 items
> [1] "1" "2" "3"
> > close(file)
> 
> Is this a bug?
> 
> Matt
> 
> 
> 


