[R] readHTMLTable help
Duncan Temple Lang
duncan at wald.ucdavis.edu
Wed Mar 28 07:51:47 CEST 2012
The HTML page is laid out with a separate table nested inside each cell
of the top-most table, so what looks like a simple table is in fact much
more complex. readHTMLTable() is intended for quick and easy tables.
For tables such as this one, you have to implement a more customized processor.
doc = htmlParse("http://220.127.116.11/climatologia/php/vientoMaximo8.php?IdEstacion=330007&FechaIni=01-1-1980")
tb = getNodeSet(doc, "//table")[[1]]
This gives the top-most table.
xmlSize(tb) tells us the number of rows. We want to skip the first 3 to get to the data.
Then you can process each of the remaining rows and, within each row, the cells that hold the data.
And the details go on....
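
The steps above can be sketched roughly as follows. This is only an
illustration, not a processor for the actual page: the inline HTML is a
made-up stand-in with the same shape (an outer table, three header rows,
and a nested table in each data cell), and the column names "fecha" and
"intensidad" are assumptions. The rows are selected with a relative XPath
rather than by child position, so stray whitespace nodes or a <tbody>
wrapper should not matter.

```r
library(XML)

# Hypothetical stand-in for the real page: each data cell of the outer
# table wraps its value in a nested table; the first 3 rows are headers.
html <- '
<html><body>
<table>
  <tr><td>Title</td></tr>
  <tr><td>Subtitle</td></tr>
  <tr><td>Fecha</td><td>Intensidad</td></tr>
  <tr><td><table><tr><td>01-01-1980</td></tr></table></td>
      <td><table><tr><td>35</td></tr></table></td></tr>
  <tr><td><table><tr><td>02-01-1980</td></tr></table></td>
      <td><table><tr><td>42</td></tr></table></td></tr>
</table>
</body></html>'

doc <- htmlParse(html, asText = TRUE)

# First table in document order, i.e. the top-most one.
tb <- getNodeSet(doc, "//table")[[1]]

# Rows that belong directly to the top-most table (not to nested tables).
rows <- getNodeSet(tb, "./tr | ./tbody/tr")

# Skip the 3 header rows to get to the data.
data_rows <- rows[-(1:3)]

# For each data row, take the text of each outer cell. xmlValue() collects
# the text of all descendants, so it reaches into the nested table.
cells <- lapply(data_rows, function(row) {
  vapply(getNodeSet(row, "./td"),
         function(cell) trimws(xmlValue(cell)),
         character(1))
})

result <- as.data.frame(do.call(rbind, cells), stringsAsFactors = FALSE)
names(result) <- c("fecha", "intensidad")  # assumed column names
result$intensidad <- as.numeric(result$intensidad)
```

For the real page you would replace the inline HTML with the htmlParse()
call on the URL and adjust the number of rows skipped and the cells kept
after inspecting a few rows by hand.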
On 3/27/12 10:57 AM, Lucas wrote:
> Hello to everyone.
> I'm using this function to download some information from a website.
> This is the URL:
> If you go to that website you'll find a table with meteorological
> information. One column is called "Intesidad Máxima Diaria", and that is
> the one I need.
> I've been trying to extract that column, but I'm unable to do it.
> First I tried simply to download the complete table and then apply some kind
> of filter to extract the column but, for some reason, when I call
> a<-readHTMLTable(url), the table is downloaded in an unfriendly format and I
> cannot distinguish the columns.
> If anyone could help me, I'd appreciate it.
> Thank you.