[R] RCurl::postForm() -- how does one determine what the names are of each form element in an online html form?
tony.breyal at googlemail.com
Wed Dec 10 19:29:39 CET 2008
Thank you Felix and also to the individual who replied off-list.
re: html code -- you are both indeed correct that the form elements
are named within the html code for a simple form, and i thank you both
for letting me know about this. For simple forms i think i will try
and write myself a function which can automatically identify these
elements, probably using the XML package. It looks to me like R only
has to inspect the contents of the <form></form> tag to determine
and was not able to investigate, but thank you for the suggestion in
this direction, it may be that i can learn about this in the future.
for anyone who does a search on this topic, the html form elements in
the html code below are called: "licenseID", "content" and "paramsXML".
### html code example start ###
<head> <title>Calais test page</title> </head>
licenseID: <input type="text" name="licenseID" />
<input type="submit" /><br />
content: <br />
<textarea rows="15" cols="80" name="content" ></textarea><br />
paramsXML: <br />
<textarea rows="15" cols="80" name="paramsXML" ></textarea>
### html code example end ###
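### something like the following would help identify the form element
### names i think
library(XML)
#src.file <- [location of the html code above]
# parse into an internal document; the empty error handler silences parser warnings
html <- htmlTreeParse(src.file, useInternal=TRUE, error=function(...) {})
# pull out the text nodes inside the form (the labels in front of each element)
xpathApply(html, "//body//form//text()", xmlValue)
# output (list indices omitted):
#  "\r\n\tlicenseID: "
#  "\r\n\tcontent: "
#  "\r\n\tparamsXML: "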
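
The text nodes above are the visible labels, which only happen to match the element names on this page. A minimal sketch of reading the name attributes directly, assuming the XML package is installed; formElementNames() is a hypothetical helper (not part of RCurl or XML), and the <form> wrapper and its action/method attributes in the test string are assumptions, not taken from the real Calais page:

```r
library(XML)

# Hypothetical helper: return the name attribute of every input, textarea
# and select element found inside a <form>.
formElementNames <- function(src, asText = FALSE) {
  # src is a file name or URL, or the html itself when asText = TRUE
  doc <- htmlParse(src, asText = asText)
  nodes <- getNodeSet(doc,
    "//form//input[@name] | //form//textarea[@name] | //form//select[@name]")
  # elements without a name (e.g. the submit button) are skipped by [@name]
  as.character(unique(unlist(lapply(nodes, xmlGetAttr, "name"))))
}

# Assumed stand-in for the page above, wrapped in a <form> tag
page <- '<html><body><form action="#" method="post">
licenseID: <input type="text" name="licenseID" />
<input type="submit" /><br />
content: <textarea rows="15" cols="80" name="content"></textarea><br />
paramsXML: <textarea rows="15" cols="80" name="paramsXML"></textarea>
</form></body></html>'

form.names <- formElementNames(page, asText = TRUE)
print(form.names)
```

The names found this way are the ones that would be used as argument names in a postForm() call. This only covers simple forms; elements created by javascript after the page loads would not show up in the parsed html.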
On 10 Dec, 03:06, "Felix Andrews" <fe... at nfrac.org> wrote:
> 2008/12/10 Tony Breyal <tony.bre... at googlemail.com>:
> > Dear R-Help,
> > I am looking into using the Open Calais web service
> > (http://sws.clearforest.com/calaisViewer/) for text mining purposes. I would
> > like to use R to post text into one of the forms on their website.
> > In package RCurl, there is a function called postForm(). This sounds
> > like it would do the job. Unfortunately the URL used in the example is
> > no longer valid (i have emailed the maintainer about this).
> > Question: How does one determine the name of the form elements to use?
> > is there an R function which will print out the names of these
> > elements perhaps?
> > [i am still learning, so please forgive me if i used the wrong
> > terminology.]
> > ### Example from ?postForm ###
> > library(RCurl)
> > # Now looking at POST method for forms.
> > postForm("http://www.speakeasy.org/~cgires/perl_form.cgi",
> > "some_text" = "Duncan",
> > "choice" = "Ho",
> > "radbut" = "eep",
> > "box" = "box1, box2"
> > )
> > ### Example ends ###
> > So in the above code, i believe the form elements are: "some_text",
> > "choice", "radbut" and "box". But how does one find out the names of
> > these form elements if one is not given them previously?
> You need to look at the HTML source of the web page to work out what
> the form elements are called. However, in your case, it is not a
> able to post the data to http://sws.clearforest.com/calaisViewer//Bridge.asmx/BridgeMe
> with an element 'content' containing the (url-encoded) text, and an
> element 'type' = 'text/txt'.
> If that works it would return the result in an XML block.
> > I hope that the above made sense, and thank you kindly in advance for
> > any help.
> > Tony Breyal.
> > ______________________________________________
> > R-h... at r-project.org mailing list
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> Felix Andrews / 安福立 http://www.neurofractal.org/felix/
> 3358 543D AAC6 22C2 D336 80D9 360B 72DD 3E4C F5D8