[R] Read HTML table

f.jamitzky f.jamitzky at gmail.com
Mon Nov 19 10:56:31 CET 2007


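For a fixed number of columns you can use

data.frame(matrix(unlist(data), ncol = ncol, byrow = TRUE))

to reshape the extracted cell values into a table. xpathApply returns a
list, hence the unlist, and byrow = TRUE is needed because the cells come
back row by row while matrix fills column by column by default.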
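
As a rough sketch of that reshaping step (base R only): the cell values and
the column names "symbol" and "value" below are made up for illustration,
and the vector "cells" stands in for the unlisted xpathApply result.

```r
# Cell values in document order, i.e. row by row
# (made-up data: 3 rows x 2 columns).
cells <- c("AAA", "1.5",
           "BBB", "2.25",
           "CCC", "3")
n_col <- 2

# byrow = TRUE because the cells arrive row by row, while matrix()
# fills column by column by default.
m <- matrix(cells, ncol = n_col, byrow = TRUE)

tab <- data.frame(m, stringsAsFactors = FALSE)
names(tab) <- c("symbol", "value")   # hypothetical column names
tab$value <- as.numeric(tab$value)   # cell text stays character until converted

# Look up a value by key
tab[tab$symbol == "BBB", "value"]
```

Once the numeric columns are converted, the data frame can be subset and
passed to other functions like any other R table.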

htmlTreeParse should be rather quick, but if it is too slow you could use
curl to download the data and xmlstarlet to transform it to XML. Then you
can use xmlTreeParse or even read.csv to read the file into R.
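
To see whether the time goes into the download or into the parsing, the two
steps can be separated. This is only a sketch, assuming the XML package is
installed: the small HTML file is made up and stands in for a page that was
already downloaded, and download.file is shown as one possible in-R
alternative to curl.

```r
library(XML)

# Stand-in for a page fetched once with curl or download.file():
# a small local HTML file with made-up content.
html_file <- tempfile(fileext = ".html")
writeLines(c(
  "<html><body><table>",
  "<tr><td>AAA</td><td>1.5</td></tr>",
  "<tr><td>BBB</td><td>2.25</td></tr>",
  "</table></body></html>"
), html_file)

# For a real page the download would be its own step, for example:
# download.file("http://blabla", html_file)   # placeholder URL from the thread

# Time only the parsing and cell extraction on the local file
timing <- system.time({
  doc <- htmlTreeParse(html_file, useInternalNodes = TRUE)
  cells <- xpathSApply(doc, "//td", xmlValue)
})

# Reshape the row-by-row cell values into a 2-column table
tab <- data.frame(matrix(cells, ncol = 2, byrow = TRUE),
                  stringsAsFactors = FALSE)
```

If timing is small for the local file, the delay is most likely in fetching
the page rather than in htmlTreeParse itself.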


Gamma wrote:
> 
> 
> f.jamitzky wrote:
>> 
>> You can use htmlTreeParse and xpathApply from the XML library.
>> something like:
>> 
>> xpathApply(htmlTreeParse("http://blabla", useInternalNodes = TRUE),
>>            "//td", xmlValue)
>> 
>> should do it.
>> 
> 
> Thank you. Any further ideas on how to transform the result into a matrix,
> something that R could easily search to find values? I want to use the
> imported data in various calculations (Rmetrics) and hope to automate the
> process somewhat.
> 
> Another thing: htmlTreeParse takes a while to complete. For a 15-row
> table it takes about 10-15 seconds. Considering I am planning to use this
> method on multiple (15-20) tables with up to 1000 rows, it might not be the
> ideal solution?
> 

-- 
View this message in context: http://www.nabble.com/Read-HTML-table-tf4832010.html#a13830637
Sent from the R help mailing list archive at Nabble.com.


