[R] How to download and unzip data in a loop

Jon Skoien jon.skoien at jrc.ec.europa.eu
Thu Feb 5 12:11:14 CET 2015


In addition to following Jim's suggestion, you should also pass 
full.names = TRUE to dir(). Otherwise it returns bare file names, and 
gzfile() will look for them in your current working directory, not in 
tmpdir.
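
A minimal sketch of both fixes, using a small stand-in file instead of 
the real download so that it runs offline. The subdirectory name 
"met_demo" and the file contents are made up for illustration:

```r
# Work in a fresh subdirectory of tempdir() so only our file is listed.
tmpdir <- file.path(tempdir(), "met_demo")
dir.create(tmpdir, showWarnings = FALSE)

# Write a small gzip-compressed stand-in for a downloaded file.
con <- gzfile(file.path(tmpdir, "724927-23285-2015.gz"), "w")
writeLines(c("1 2 3", "4 5 6"), con)
close(con)

# 'pattern' is a regular expression, hence "\\.gz$" rather than "*.gz".
short_names <- dir(tmpdir, pattern = "\\.gz$", full.names = FALSE)
full_paths  <- dir(tmpdir, pattern = "\\.gz$", full.names = TRUE)

file.exists(short_names)  # FALSE unless the working directory is tmpdir
file.exists(full_paths)   # TRUE: the path includes tmpdir

# Pass the variable, not the string 'files', and read every path in turn.
tables <- lapply(full_paths, function(f) read.table(gzfile(f)))
```

In the real loop, full_paths would hold the files written by 
download.file().
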
Also, the unzipped files do not appear to have a consistent number of 
columns per row, so read.table may not parse them correctly.
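
One way to check this before calling read.table is count.fields(), 
falling back to readLines() if the rows turn out to be irregular. This 
is only a sketch: the three lines below are invented test data, not 
the layout of the NOAA files.

```r
# Stand-in for one compressed file whose rows have differing field counts.
path <- tempfile(fileext = ".gz")
con <- gzfile(path, "w")
writeLines(c("a b c", "d e", "f g h i"), con)
close(con)

# count.fields() reports the number of whitespace-separated fields per line.
con <- gzfile(path, "r")
n_fields <- count.fields(con)
close(con)
irregular <- length(unique(n_fields)) > 1

# If the rows are irregular, keep the raw lines and parse them separately,
# for example with substr() once the record layout is known.
con <- gzfile(path, "r")
raw_lines <- readLines(con)
close(con)
```

If irregular is FALSE for a file, read.table should be able to read it 
as is.
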

Jon

On 2/5/2015 11:30 AM, jim holtman wrote:
> try taking the quotes off of 'files'
>
>
> Jim Holtman
> Data Munger Guru
>
> What is the problem that you are trying to solve?
> Tell me what you want to do, not how you want to do it.
>
> On Wed, Feb 4, 2015 at 5:24 PM, Alexandra Catena <amc5981 at gmail.com> wrote:
>
>> Hi All,
>>
>> I need to loop through and download the past 10 years of met data to a
>> temporary directory.  I then need to unzip it and place it into another
>> directory.
>>
>>
>> year = (2005:2015)
>>
>> for (i in year)
>>    tmpdir = tempdir()
>>    file[i] = file.path(tmpdir, sprintf('724927-23285-%4i.gz', i))
>>    url = sprintf('ftp://ftp.ncdc.noaa.gov/pub/data/noaa/%4i/724927-23285-%4i.gz', i, i)
>>    #file = basename(url)
>>    download.file(url, file[i])
>>    files = dir(tmpdir, '*.gz', full.names=FALSE)
>>    read.table(gzfile('files'))
>>
>>
>>
>> 'file' returns 2015 indices with "/tmp/RtmpKvB4Wz/724927-23285-2015.gz"
>> next to 2015. and files returns 724927-23285-2015.gz.  However, when I try
>> to unzip the gz file using the last line, it says it cannot open the
>> connection and the probable reason is that there is no such file or
>> directory.
>>
>>
>>
>> Thanks,
>> Alexandra
>>
>> ______________________________________________
>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>

-- 
Jon Olav Skøien
Joint Research Centre - European Commission
Institute for Environment and Sustainability (IES)
Climate Risk Management Unit

Via Fermi 2749, TP 100-01,  I-21027 Ispra (VA), ITALY

jon.skoien at jrc.ec.europa.eu
Tel:  +39 0332 789205

Disclaimer: Views expressed in this email are those of the individual 
and do not necessarily represent official views of the European Commission.


