[R] 404 HTTP not found

Gabor Grothendieck ggrothendieck at gmail.com
Mon Sep 18 06:46:23 CEST 2006


See ?try. Wrapping the call in try() lets the loop carry on after a
failed download, as in this example:

current.pages <- c("http://www.google.com", "http://www.google.com/test",
  "http://www.yahoo.com")

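for(i in seq(along = current.pages)) {
	# try() returns an object of class "try-error" instead of stopping
	website <- try(tolower(scan(current.pages[i],
		what="character", sep="\n", quiet=TRUE)))
	# report the outcome and move on to the next page either way
	if (inherits(website, "try-error")) cat(current.pages[i], "bad\n")
	else cat(current.pages[i], "ok\n")
}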
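
A variation on the loop above, in case the reason for the failure is also
wanted: tryCatch() hands the condition object to a handler, so its message
can be reported. This is only a sketch. The helper name read_page and its
reader argument are made up for illustration (reader just makes it possible
to try the function without a network connection); they are not part of R.

```r
# Hypothetical helper: returns the lower-cased lines of a page,
# or NULL if the page could not be read (e.g. a 404).
# 'reader' is an illustrative argument; by default it is the same
# scan() call used in the loop above.
read_page <- function(url,
                      reader = function(u) scan(u, what = "character",
                                                sep = "\n", quiet = TRUE)) {
  tryCatch(
    tolower(reader(url)),
    error = function(e) {
      # report why the page failed, then signal failure with NULL
      cat(url, "bad:", conditionMessage(e), "\n")
      NULL
    }
  )
}

# Usage inside the loop (needs network access):
#   website <- read_page(current.pages[i])
#   if (is.null(website)) next   # skip the broken link, keep looping
```

Returning NULL keeps the "did it work" test down to a single is.null()
check in the calling loop.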



On 9/18/06, Stefan Th. Gries <stgries_lists at arcor.de> wrote:
> Hi
>
> I wrote a script which retrieves links from websites and loads them with scan:
>
> ...
> website<-tolower(scan(current.pages[i], what="character", sep="\n", quiet=TRUE))
> ...
>
> However, the script occasionally finds broken links, such as <http://www.google.com/test>. When the script tries to access such websites, the repeat loop breaks and I get the error message
>
> Error in file(file, "r") : unable to open connection
> In addition: Warning message:
> cannot open: HTTP status was '404 Not Found'
>
> Now my question: is there a way to test whether the target of a link exists that does not result in an error and thus break my loop? I looked at the help files for files, scans, connections, and did a search for "404" in the archives but couldn't find anything. I work with R 2.3.1 patched on Windows XP (both Home and Prof) and would appreciate any pointers ...
> Thanks a lot,
> STG


