[R] RCurl, postForm()

Sven Duve sduve at hotmail.com
Mon May 28 22:24:35 CEST 2012


On 28/05/12 20:46, Simon Kiss wrote:
> Dear colleagues,
> Could I get some assistance using postForm() to scrape the business names and addresses at this website:
> http://www.brantford.ca/business/LocalBusinessCommunity/Pages/BusinessDirectorySearch.aspx
>
> I've read through (http://www.omegahat.org/RCurl/RCurlJSS.pdf) and scoured the web for tutorials, but I can't crack it.  I'm aware that this is probably a pretty basic question, but I need some help regardless. Yours, Simon Kiss
>
> library(XML)
> library(RCurl)
> library(scrapeR)
> library(RHTMLForms)
> #Set URL
> bus<-c('http://www.brantford.ca/business/LocalBusinessCommunity/Pages/BusinessDirectorySearch.aspx')
> #Scrape URL
> orig<-getURLContent(url=bus)
> #Parse doc
> doc<-htmlParse(orig[[1]], asText=TRUE)
> #Get The forms
> forms<-getNodeSet(doc, "//form")
> forms[[1]]
> #These are the input nodes
> getNodeSet(forms[[1]], ".//input")
> #These are the select nodes
> getNodeSet(forms[[1]], ".//select")
>
> *********************************
> Simon J. Kiss, PhD
> Assistant Professor, Wilfrid Laurier University
> 73 George Street
> Brantford, Ontario, Canada
> N3T 2C9
> Cell: +1 905 746 7606
>
Hey Simon,

I just had a look at the source of the webpage and, if I am not mistaken,
this involves JavaScript. I am trying to do the same on a different page,
but couldn't get help either.
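
One thing that might be worth trying, though I have not tested it against this site: the URL ends in .aspx, so the page is probably an ASP.NET form, and such forms usually expect their hidden fields (for example __VIEWSTATE) to be sent back with the request. Below is a minimal sketch of collecting the input fields into a named vector that could then be handed to postForm(). The HTML string and the field names txtKeyword and btnSearch are made up for illustration; the real form will have its own names, and if the results are built entirely by client-side JavaScript this approach will not be enough.

```r
library(XML)

# Hypothetical stand-in for the downloaded page; the real form's field
# names and values will differ.
html <- '<html><body><form action="Search.aspx" method="post">
  <input type="hidden" name="__VIEWSTATE" value="abc123" />
  <input type="hidden" name="__EVENTVALIDATION" value="xyz789" />
  <input type="text" name="txtKeyword" value="" />
  <input type="submit" name="btnSearch" value="Search" />
</form></body></html>'

doc  <- htmlParse(html, asText = TRUE)
form <- getNodeSet(doc, "//form")[[1]]

# Collect every named <input> as a name = value pair,
# using "" when an input has no value attribute.
inputs <- getNodeSet(form, ".//input[@name]")
params <- sapply(inputs, xmlGetAttr, "value", "")
names(params) <- sapply(inputs, xmlGetAttr, "name")

# Fill in the search term (hypothetical field name).
params[["txtKeyword"]] <- "bakery"

# Posting needs network access and RCurl, so it is left commented out.
# 'bus' is the URL variable from Simon's code.
# library(RCurl)
# result <- postForm(bus, .params = params, style = "POST")
```

With the real page, the same steps would start from the doc object Simon already has, and the response from postForm() could be parsed again with htmlParse() to see whether the listings are in it.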

If you get the solution from somewhere, please let me know.

Sven


