[R] difficulties with read.table applied to files from URL

David Winsemius dwinsemius at comcast.net
Thu Aug 5 21:38:20 CEST 2010


On Aug 5, 2010, at 3:05 PM, David Winsemius wrote:

>
> On Aug 5, 2010, at 2:48 PM, Archana Dayalu wrote:
>
>> Hello,
>> I am using read.table to read files directly from a public ftp  
>> site. I have
>> a general list of files that may or may not exist in the ftp  
>> directory, but
>> my hope was that R would read the file if it existed and ignored it  
>> if it
>> didn't exist and move on to the next one. However, when R arrives  
>> at a file
>> that does not exist I get the error message "Error in file(file,  
>> "rt") :
>> cannot open the connection" This makes sense, but I was wondering  
>> if there
>> was any way I could circumvent this error message and have R  
>> instead give me
>> a warning message without terminating my entire loop.
>
> Yes.
>
> ?try
>
>
>> Ideally, I would get a
>> warning message saying the connection does not exist, and then have  
>> R skip
>> to the next file.
>> My code is copied below.
>>
>
> Something like these modifications .... untested:

Tried it and discovered I needed to remove the line feeds that broke the  
file name assignment, as well as adding 2 parens to close the if-tests.  
With those modifications you seem to be getting some sort of activity.  
Aborting the run after a few accesses I see:

 > str(r.obj.dat)
  chr "alt.ch4.2000.dat.cont"
 > str(alt.ch4.2000.dat.cont)
'data.frame':	8784 obs. of  10 variables:
  $ DATE: chr  "2000-01-01" "2000-01-01" "2000-01-01" "2000-01-01" ...
  $ TIME: chr  "01:00" "02:00" "03:00" "04:00" ...
  $ DATE: chr  "9999-99-99" "9999-99-99" "9999-99-99" "9999-99-99" ...
  $ TIME: chr  "99:99" "99:99" "99:99" "99:99" ...
  $ CH4 : num  -10000000 1864 1868 1863 1869 ...
  $ ND  : int  0 10 7 10 10 7 10 10 7 10 ...
  $ SD  : num  -10 1.8 0.8 2.5 1.1 ...
  $ F   : int  0 1 1 1 1 1 1 1 1 1 ...
  $ CS  : int  0 0 0 0 0 0 0 0 0 0 ...
  $ REM : int  -99999999 -99999999 -99999999 -99999999 -99999999 -99999999 -99999999 -99999999 -99999999 -99999999 ...
 >
>
>> hourly.years <- c(2000:2008)
>> hourly.species <- c('ch4','co2','co')
>> station.names <-
>> c('alt482n00','chm449n00','egb444n01','etl454n00',
>>   'fsd449n00','llb454n01','wsa443n00','cdl453n00')
>> for (kk in hourly.years) {
>>   for (i in hourly.species) {
>>       for (nn in station.names) {
>>           file1 <- paste('ftp://gaw.kishou.go.jp/pub/data/current/',
>>                          i,'/hourly/y',kk,'/',nn,'.ec.as.cn.',i,'.nl.hr',kk,'.anc',sep='')
>> #ancillary data
>>           file2 <- paste('ftp://gaw.kishou.go.jp/pub/data/current/',
>>                          i,'/hourly/y',kk,'/',nn,'.ec.as.cn.',i,'.nl.hr',kk,'.dat',sep='')
>> #concentration data
>>           dumm.anc <- try( read.table(file1,skip=32,header=F,as.is=T) )
> if (inherits(dumm.anc, "try-error")) {} else {
>>           colnames(dumm.anc) <- c('DATE','TIME','WD','WS','RH','AT')
>>           r.obj.anc <- paste(substr(nn,1,3),i,kk,'anc.cont',sep='.')
>>           assign(r.obj.anc,dumm.anc)
>                                             }
>>           dumm.dat <- try( read.table(file2,skip=32,header=F,as.is=T) )
> if (inherits(dumm.dat, "try-error")) {} else {  #will skip if error
>>           colnames(dumm.dat) <-
>> c('DATE','TIME','DATE','TIME','CH4','ND','SD','F','CS','REM')
>>            r.obj.dat <- paste(substr(nn,1,3),i,kk,'dat.cont',sep='.')
>>            assign(r.obj.dat,dumm.dat)
>                  }
>
> #  --------------presumably these do not depend on the read-tries---------
>>           status<-paste(i,nn,kk,'----EC HOURLY/CONTINUOUS DAT/ANC read complete',sep=' ')
>>           print(status,quote=F)
>>           }
>>       }
>>   }
>>
>
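
Since you asked for a warning rather than an error, the same skip-on-failure
idea can also be written with tryCatch(). What follows is only a sketch under
some assumptions: read_or_skip() is a hypothetical helper name of my own, the
demonstration uses made-up local temp files rather than the ftp URLs, and it
collects results in a named list instead of using assign().

```r
# read_or_skip() is a hypothetical helper, not part of the original code.
# It returns the data frame on success, or NULL (plus a warning) on failure.
read_or_skip <- function(path, ...) {
  tryCatch(
    # suppressWarnings() hides the low-level "cannot open file" warning
    # so that only our own warning is reported.
    suppressWarnings(read.table(path, ...)),
    error = function(e) {
      warning("skipping ", path, ": ", conditionMessage(e), call. = FALSE)
      NULL
    }
  )
}

# Demonstration with local files (made-up names) standing in for the URLs.
good <- tempfile(fileext = ".dat")
writeLines(c("2000-01-01 01:00 1864", "2000-01-01 02:00 1868"), good)
absent <- file.path(tempdir(), "no-such-file.dat")

results <- list()
for (f in c(good, absent)) {
  dat <- read_or_skip(f, header = FALSE, as.is = TRUE)
  if (is.null(dat)) next          # nothing to store; move on to the next file
  results[[basename(f)]] <- dat   # named list element instead of assign()
}
```

The loop keeps going after the unreadable file, and the warnings are reported
once the loop finishes. Keeping the data frames in a list also makes it easy
to see which files were actually read with names(results).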
> -- 
>
> David Winsemius, MD
> West Hartford, CT
>

David Winsemius, MD
West Hartford, CT


