[R] trouble for parsing HTML files

Milan Bouchet-Valat nalimilan at club.fr
Thu Mar 22 18:42:44 CET 2012


On Thursday, 22 March 2012 at 17:20 +0100, Julien Velcin wrote:
> Hi all,
> 
> Using the XML package, I'm not able to parse some html webpages. Here  
> is my code and the error message:
> 
> library("XML")
> url <- "http://www.huffingtonpost.com/social/GraniteSkyline?action=fans"
> doc <- htmlParse(url)
> 
> Error: Namespace prefix ꛀ of attribute (null) is not defined
> 
> I've searched a lot on the Internet, but it's really difficult to find  
> something useful for R.

Which versions of R and XML are you using? The code you provided works
fine here (R 2.14.1 x86_64 and XML 3.9-4 on Fedora 16). Please post the
output of sessionInfo() so we can compare setups.
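
For reference, the version details can be collected along these lines. This
is only a minimal sketch: the variable names info and xml_version are made up
for illustration, and the requireNamespace() guard is there just so the
snippet also runs where XML is not installed.

```r
# Report the R version, platform, locale and loaded packages in one call.
info <- sessionInfo()
print(info)

# Report the installed XML package version, if the package is available.
if (requireNamespace("XML", quietly = TRUE)) {
  xml_version <- as.character(packageVersion("XML"))
  cat("XML package version:", xml_version, "\n")
} else {
  cat("The XML package is not installed in this library.\n")
}
```

Pasting the printed output into a reply is usually enough to spot a version
or locale difference between two machines.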

BTW, see ?RSiteSearch to search for R content on the Web.
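
A possible way to use it for this problem is sketched below. The search terms
are an assumption, picked from the error message quoted above, and the call is
wrapped in interactive() because RSiteSearch() opens the results in a web
browser.

```r
# Hypothetical search terms taken from the error message in this thread.
query <- "htmlParse namespace prefix not defined"

# RSiteSearch() sends the query to the R site search and opens a browser,
# so only run it from an interactive session.
if (interactive()) {
  RSiteSearch(query)
}
```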


Cheers


