[R] how to know if a file exists on a remote server?

Baoqiang Cao bqcaomail at gmail.com
Wed Dec 1 00:16:23 CET 2010


Hi Georg,

Your code works: it doesn't give me any error message, which is
critical for me because I need to use it in a loop and I don't know
how to catch errors. Before your message, I was using download.file,
but the loop stopped with an error whenever a file didn't exist. So I
guess the option method="wget" made the difference.

To summarize (in case it is useful to others), there are (at least)
two ways to deal with files that may not exist on the server:

1) Georg Ruß:
v <- download.file(url, destf, method="wget")
if (v != 0) {
  # download.file failed
}
# no error is raised, though
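
For anyone else who does not know how to catch the error in a loop: a
minimal sketch using base R's tryCatch(), which should also work
without wget. The helper names try_download and download_all are made
up for this example, and treating an R error like a non-zero return
value is my own assumption about what is wanted here.

```r
# Hypothetical helper: TRUE if the download succeeded, FALSE otherwise.
# Extra arguments (e.g. method = "wget") are passed on to download.file().
try_download <- function(url, destfile, ...) {
  status <- tryCatch(
    suppressWarnings(download.file(url, destfile, quiet = TRUE, ...)),
    error = function(e) 1L  # treat an R error like a non-zero exit code
  )
  identical(as.integer(status), 0L)
}

# Hypothetical loop: try every URL and keep going when one is missing.
# Files are saved under destdir using the last component of each URL.
download_all <- function(urls, destdir = tempdir()) {
  vapply(urls,
         function(u) try_download(u, file.path(destdir, basename(u))),
         logical(1))
}
```

download_all() returns a logical vector named by URL, so the missing
files can be listed afterwards with names(res)[!res].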

2) Henrique Dallazuanna and Steven Mosher both suggested using RCurl.
Here is example code from Henrique for checking whether a file exists
on a server:
"
library(RCurl)
h = basicHeaderGatherer()
Lines <- getURI("http://www.pdb.org/pdb/files/2J0S.1001",
headerfunction = h$update)
h$value()[['status']]

If the status is 404, then not found. If exists then status should be 200.
"
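
Henrique's snippet could be wrapped for use in a loop roughly as
follows. This is only a sketch: the function names status_ok and
remote_file_exists are invented here, it needs the RCurl package
installed, and counting only status 200 as "exists" (so a redirect
counts as not found) is an assumption on my part.

```r
# Hypothetical helper: interpret a status string such as "200" or "404".
status_ok <- function(status) {
  isTRUE(as.integer(status) == 200L)
}

# Hypothetical wrapper around Henrique's code; requires RCurl.
remote_file_exists <- function(url) {
  h <- RCurl::basicHeaderGatherer()
  status <- tryCatch({
    RCurl::getURI(url, headerfunction = h$update)
    h$value()[["status"]]
  }, error = function(e) NA_character_)  # e.g. server not reachable
  status_ok(status)
}
```

Note that getURI() still fetches the body when the file exists, so
this is a check, not a cheaper replacement for the download.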

What a productive day!

BC
On Tue, Nov 30, 2010 at 1:34 PM, Georg Ruß <research at georgruss.de> wrote:
> On 30/11/10 10:10:07, Baoqiang Cao wrote:
>> I'd like to download some data files from a remote server. The problem
>> is that some of the files don't actually exist, which I can't know
>> before trying. Is there a function in R that can tell me whether a
>> file exists on a remote server?
>
> Hi Baoqiang,
>
> try downloading the file with R's download.file() function. Then you
> should examine the returned value.
>
> Citing a part of ?download.file below:
>
>>> Value:
>>> An (invisible) integer code, ‘0’ for success and non-zero for
>>> failure.  For the ‘"wget"’ and ‘"lynx"’ methods this is the status
>>> code returned by the external program.  The ‘"internal"’ method can
>>> return ‘1’, but will in most cases throw an error.
>
> So if you call your download via
>
> v <- download.file(url, destfile, method="wget")
>
> and v is not equal to zero, then the file is likely to be non-existent (at
> least the download failed). Note: the method "internal" doesn't really
> change the value of v, I just tried that. With "wget" it returns "0" for
> success and "2048" (or some other value) for non-success.
>
> Regards,
> Georg.
> --
> Research Assistant
> Otto-von-Guericke-Universität Magdeburg
> research at georgruss.de
> http://research.georgruss.de
>


