[R] Grab Element from Web Page

Sparks, John James jspark4 at uic.edu
Wed Aug 14 07:34:50 CEST 2013


Dear R Helpers,

I would like to pull the CIK number from the web page

http://www.sec.gov/cgi-bin/browse-edgar?CIK=MSFT&Find=Search&owner=exclude&action=getcompany

If you put this web page into your browser you will see the CIK number in
red on the left side of the page near the top.

When I try the basic
require(scrapeR)
require(XML)
require(RCurl)
doc <- htmlTreeParse("http://www.sec.gov/cgi-bin/browse-edgar?CIK=MSFT&Find=Search&owner=exclude&action=getcompany")
str(doc)

I get a large nested list of nodes that I don't know how to
interpret.  Both
tables <- readHTMLTable(doc)

and

list<-xmlToList(doc)

result in errors.

Any (positive) guidance would be much appreciated.

--John J. Sparks, Ph.D.
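
For readers following along, here is a minimal sketch of one way the extraction described above might be approached with the XML package. It rests on several assumptions that are not from the original post: that the CIK appears as the text of a link whose href contains "CIK=", that it is a 10-digit number, and that the page can be parsed into an internal document with htmlParse(). The inline HTML string and the value 0000123456 are made-up stand-ins, not the real page or the real CIK; the live markup may differ.

```r
library(XML)

# Hypothetical stand-in for the downloaded page (an assumption, not the
# real SEC markup). The CIK value here is a placeholder.
html <- paste0(
  '<html><body><div><span class="companyName">EXAMPLE CORP CIK#: ',
  '<a href="/cgi-bin/browse-edgar?action=getcompany&amp;CIK=0000123456">',
  '0000123456 (see all company filings)</a></span></div></body></html>'
)

# asText = TRUE tells htmlParse() to parse the string itself rather than
# treat it as a file name or URL. The result is an internal document that
# the XPath helpers can query.
doc <- htmlParse(html, asText = TRUE)

# Collect the text of every link whose href carries a CIK= parameter.
link_text <- xpathSApply(doc, "//a[contains(@href, 'CIK=')]", xmlValue)

# Pull the first run of exactly ten digits out of each matching link text
# and drop duplicates.
cik <- unique(regmatches(link_text, regexpr("[0-9]{10}", link_text)))

print(cik)
```

For the live page, the same XPath step could be tried on a document built from the downloaded page text (for instance, text fetched with RCurl's getURL() and passed to htmlParse() with asText = TRUE). Whether that works as-is depends on the server's current markup and access rules, so treat the XPath expression and the regular expression as starting points to adjust.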


