[R] How to import HTML and SQL files

Duncan Temple Lang duncan at wald.ucdavis.edu
Wed Feb 4 16:02:09 CET 2009



Dieter Menne wrote:
> Arup <arup.pramanik27 <at> gmail.com> writes:
> 
>> I can't import any HTML or SQL files into R..:confused: 
> 
> Also confused. HTML and SQL are like apples and bugs.
> 
> For HTML (assume you want to extract stock quotes from a site)
> 
> -- If you have strict XHTML, using package XML might be
>    the best choice, but I doubt you get these nowadays.
> -- Otherwise, read in the file and use regular expressions (grep, 
>    gsub) to parse.


The htmlParse() and htmlTreeParse() functions in the XML package
use the non-strict HTML parser in libxml2, so the HTML document
need not be well-formed.  That parser is quite tolerant: you get
an HTML tree back even for malformed input, although ambiguities
in the original document can lead to a tree you might not expect.

I've not had any troubles parsing HTML files with it.
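
As a rough illustration, here is a minimal sketch, assuming the XML
package is installed.  The HTML fragment, with its made-up tickers and
prices, is hypothetical and deliberately leaves out most closing tags;
the XPath expression is just one way to pull out the table cells.

```r
library(XML)

# Hypothetical, deliberately malformed HTML: <td> and the second <tr>
# are never closed, and neither are <body> and <html>.
txt <- paste0(
  "<html><body><table>",
  "<tr><td>ACME<td>12.5</tr>",
  "<tr><td>XYZ<td>7.25",
  "</table>"
)

# asText = TRUE tells htmlParse() that txt holds the content itself
# rather than a file name or URL.
doc <- htmlParse(txt, asText = TRUE)

# Collect the text of every <td> node in the repaired tree.
cells <- xpathSApply(doc, "//td", xmlValue)
print(cells)
```

For a real page, pass the file name or URL to htmlParse() instead of a
string and adjust the XPath expression to the page's structure.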

D.

> 
> For SQL: SELECT * from mybase
> 
> -- "Importing" that string does not help very much, this is 
>    a program telling you what to do when you know your database.
> -- You might have a look at package RODBC or RSQLite; details depend on 
>    the database you are going to use.
> 
> Dieter
> 
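
For the SQL side, the route Dieter describes might look like the
following minimal sketch, assuming the DBI and RSQLite packages are
installed.  The in-memory database and the columns of "mybase" are
invented for illustration; with a real database you would connect to
it and skip the dbWriteTable() step.

```r
library(DBI)

# Throwaway in-memory SQLite database, used only for illustration.
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Hypothetical table contents, so that the query has something to return.
dbWriteTable(con, "mybase",
             data.frame(symbol = c("ACME", "XYZ"),
                        price  = c(12.5, 7.25)))

# Send the SQL statement to the database; the result is a data frame.
res <- dbGetQuery(con, "SELECT * FROM mybase")
print(res)

dbDisconnect(con)
```

The SQL text itself is only a string passed to dbGetQuery(); what gets
"imported" into R is the result set.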
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.