[R] reading file in zip archive

David Winsemius dwinsemius at comcast.net
Thu May 31 16:22:44 CEST 2012


On May 31, 2012, at 6:11 AM, Iain Gallagher wrote:

> Hi Phil
>
> That's it. Thanks.
>
> Will have a read at the docs now and see if I can figure out why  
> leaving the 'r'ead instruction out works. Seems counter-intuitive!

The connections help page (?unz) says that unz reads in binary mode. You
were specifying text mode with 'r'. See if open="rb" is any more successful.
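
For what it's worth, a minimal sketch of the form that worked in this thread (unz() with no open argument) might look like the following. The temporary directory, the single-file archive and header = TRUE are choices made for illustration rather than part of the original example, and zip() assumes an external zip program is installed.

```r
# Sketch: build a small zip archive, then read one member through unz().
# Assumption: an external 'zip' program is on the PATH (utils::zip() needs one).
old <- setwd(tempdir())

x <- data.frame(V1 = rep("a", 5), V2 = rep("b", 5))
write.table(x, "x.txt", sep = "\t", quote = FALSE)
zip("test.zip", files = "x.txt")

pathToZip <- file.path(getwd(), "test.zip")

# No 'open' argument: the connection is created unopened, so
# read.table() opens it for reading itself and closes it when done.
z <- unz(pathToZip, "x.txt")
zT <- read.table(z, header = TRUE, sep = "\t")

setwd(old)
print(dim(zT))
```

Leaving the mode out means read.table() decides how to open the connection, which is the variant Phil suggested and Iain confirmed.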

-- 
David.
>
> Best
>
> Iain
>
>
>
> ________________________________
> From: Phil Spector <spector at stat.berkeley.edu>
> To: Iain Gallagher <iaingallagher at btopenworld.com>
> Cc: r-help <r-help at r-project.org>
> Sent: Thursday, 31 May 2012, 0:06
> Subject: Re: [R] reading file in zip archive
>
> Iain -
>    Do you see the same behaviour if you use
>
> z <- unz(pathToZip, 'x.txt')
>
> instead of
>
> z <- unz(pathToZip, 'x.txt','r')
>
>                     - Phil Spector
>                      Statistical Computing Facility
>                      Department of Statistics
>                      UC Berkeley
>                     spector at stat.berkeley.edu
>
>
> On Wed, 30 May 2012, Iain Gallagher wrote:
>
>> Hi Phil
>>
>> Thanks, but this still doesn't work.
>>
>> Here's a reproducible example (was wrapping my head around these  
>> functions before).
>>
>> x <- as.data.frame(cbind(rep('a',5), rep('b',5)))
>> y <- as.data.frame(cbind(rep('c',5), rep('d',5)))
>>
>> write.table(x, 'x.txt', sep='\t', quote=FALSE)
>> write.table(y, 'y.txt', sep='\t', quote=FALSE)
>>
>> zip('test.zip', files = c('x.txt', 'y.txt'))
>>
>> pathToZip <- paste(getwd(), '/test.zip', sep='')
>>
>> z <- unz(pathToZip, 'x.txt', 'r')
>> zT <- read.table(z, header=FALSE, sep='\t')
>>
>> Error in read.table(z, header = FALSE, sep = "\t") :
>>   seek not enabled for this connection
>>
>> As I said in my previous email readLines fails as well. Rather  
>> strange really.
>>
>> Anyway, as before any advice would be appreciated.
>>
>> Best
>>
>> Iain
>>
>> _________________________________________________________________________________________________
>> From: Phil Spector <spector at stat.berkeley.edu>
>> To: Iain Gallagher <iaingallagher at btopenworld.com>
>> Cc: r-help <r-help at r-project.org>
>> Sent: Wednesday, 30 May 2012, 20:16
>> Subject: Re: [R] reading file in zip archive
>>
>> Iain -
>>     Once you specify the file to unzip in the call to unz, there's no
>> need to repeat the filename in read.table.  Try:
>>
>> z <- unz(pathToZip, 'goCats.txt', 'r')
>> zT <- read.table(z, header=TRUE, sep='\t')
>>
>> (Although I can't reproduce the exact error which you saw.)
>>
>>                     - Phil Spector
>>                     Statistical Computing Facility
>>                     Department of Statistics
>>                     UC Berkeley
>>                     spector at stat.berkeley.edu
>>
>>
>>
>> On Wed, 30 May 2012, Iain Gallagher wrote:
>>
>>> Hi List
>>>
>>> I have a series of zip archives each containing several files. One
>>> of these files is called goCats.txt and I would like to read it
>>> into R from the archive. It's a simple tab delimited text file.
>>>
>>> pathToZip <-'/home/iain/Documents/Work/Results/bovineMacRNAData/ 
>>> deAnalysis/afInfection/commonNorm/twoHrs/af2
>> hrs.zip'
>>>
>>> z <- unz(pathToZip, 'goCats.txt', 'r')
>>> zT <- read.table(z, 'goCats.txt', header=T, sep='\t')
>>>
>>> Error in read.table(z, "goCats.txt", header = T, sep = "\t") :
>>>   seek not enabled for this connection
>>>
>>>
>>> The same error arises with readLines.
>>>
>>> Can anyone advise?
>>>
>>> Best
>>>
>>> iain
>>>
>>> > sessionInfo()
>>> R version 2.15.0 (2012-03-30)
>>> Platform: x86_64-pc-linux-gnu (64-bit)
>>>
>>> locale:
>>>  [1] LC_CTYPE=en_GB.utf8       LC_NUMERIC=C
>>>  [3] LC_TIME=en_GB.utf8        LC_COLLATE=en_GB.utf8
>>>  [5] LC_MONETARY=en_GB.utf8    LC_MESSAGES=en_GB.utf8
>>>  [7] LC_PAPER=C                LC_NAME=C
>>>  [9] LC_ADDRESS=C              LC_TELEPHONE=C
>>> [11] LC_MEASUREMENT=en_GB.utf8 LC_IDENTIFICATION=C
>>>
>>> attached base packages:
>>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>>
>>> loaded via a namespace (and not attached):
>>> [1] tools_2.15.0
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

David Winsemius, MD
West Hartford, CT


