[R] Parsing of HTML files in R

Duncan Temple Lang duncan at research.bell-labs.com
Thu Oct 25 16:24:35 CEST 2001


If my memory serves me correctly, Daniel Veillard's libxml library
includes an HTML parser adapted from its XML parser. If so, I can add
something to the XML package that gives access to the HTML parser, so
that the same interface works for both XML and HTML from within R.
I'll take a look and see whether this is relatively easy to do.
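
To make the idea concrete, here is a minimal sketch of reading an HTML
table into an R data frame through the XML package. It assumes a version
of the package that exposes libxml's HTML parser as htmlParse() and
converts tables with readHTMLTable(); the message above does not name
these functions, and the page content is hypothetical.

```r
# Sketch only: assumes an installed XML package that provides
# htmlParse() and readHTMLTable(); the message above does not name them.
library(XML)

# A small HTML page held in a string (hypothetical content for illustration).
html <- paste(
  "<html><body>",
  "<table>",
  "<tr><th>name</th><th>value</th></tr>",
  "<tr><td>a</td><td>1</td></tr>",
  "<tr><td>b</td><td>2</td></tr>",
  "</table>",
  "</body></html>",
  sep = ""
)

# Parse with the HTML parser rather than the strict XML parser.
doc <- htmlParse(html, asText = TRUE)

# Convert each <table> in the document to a data frame; the result is a list.
tables <- readHTMLTable(doc, stringsAsFactors = FALSE)
tbl <- tables[[1]]
print(tbl)
```

Cell values come back as character strings, so numeric columns would
still need converting, for example with as.numeric().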


Luis Torgo wrote:
> Is there any package similar to the XML package that is able to
> "extract" relevant information from HTML files? Namely, I'm interested
> in getting data that is represented as an HTML table into some R-type
> structure.
> Thank you.
> 
> --
> Luis Torgo
>     FEP/LIACC, University of Porto   Phone : (+351) 22 607 88 30
>     Machine Learning Group           Fax   : (+351) 22 600 36 54
>     R. Campo Alegre, 823             email : ltorgo at liacc.up.pt
>     4150 PORTO   -  PORTUGAL         WWW   : http://www.liacc.up.pt/~ltorgo
> 
> 
> 
> -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
> r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
> Send "info", "help", or "[un]subscribe"
> (in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
> _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._

-- 
_______________________________________________________________

Duncan Temple Lang                duncan at research.bell-labs.com
Bell Labs, Lucent Technologies    office: (908)582-3217
700 Mountain Avenue, Room 2C-259  fax:    (908)582-3340
Murray Hill, NJ  07974-2070       
         http://cm.bell-labs.com/stat/duncan


