[R] trouble for parsing HTML files

R. Michael Weylandt michael.weylandt at gmail.com
Thu Mar 22 22:12:49 CET 2012


Please post the output of sessionInfo() so we can see which version of
the XML package you are using.
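
A minimal sketch of what could be run and pasted back to the list. The
requireNamespace() guard and the libxmlVersion() call are additions for
illustration, not part of the original request; the latter is included on
the assumption that the error text originates in the underlying libxml2
parser, whose version may differ between platforms.

```r
# Report the R version, platform, and attached packages.
info <- sessionInfo()
print(info)

# Report the installed version of the XML package, if present.
# requireNamespace() avoids an error when the package is missing.
if (requireNamespace("XML", quietly = TRUE)) {
  xml_version <- as.character(packageVersion("XML"))
  cat("XML package version:", xml_version, "\n")
  # Version of the libxml2 library that the XML package was built against
  # (assumed relevant here, since htmlParse() delegates to libxml2).
  print(XML::libxmlVersion())
} else {
  cat("The XML package is not installed.\n")
}
```

Pasting the full printed output, rather than a summary, lets others compare
it line by line against a setup where the code works.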

Michael

On Thu, Mar 22, 2012 at 2:04 PM, Julien Velcin
<jvelcin at chirouble.univ-lyon2.fr> wrote:
> I use Mac OS X 10.5.8 with this version of R:
>
> R version 2.14.1 (2011-12-22)
> Platform: i386-apple-darwin9.8.0/i386 (32-bit)
>
> I've tried the "RSiteSearch" command, but it gave no results.
>
> BTW, note that the code I posted does work for some websites.
>
> Julien
>
>
>
>
> 2012/3/22, Milan Bouchet-Valat <nalimilan at club.fr>:
>> On Thursday 22 March 2012 at 17:20 +0100, Julien Velcin wrote:
>>> Hi all,
>>>
>>> Using the XML package, I'm not able to parse some HTML web pages. Here
>>> is my code and the error message:
>>>
>>> library("XML")
>>> url <- "http://www.huffingtonpost.com/social/GraniteSkyline?action=fans"
>>> doc <- htmlParse(url)
>>>
>>> Error: Namespace prefix ꛀ of attribute (null) is not defined
>>>
>>> I've searched a lot on the Internet, but it's really difficult to find
>>> something useful for R.
>> What versions of R and XML are you using? The code you provided works
>> fine here (R 2.14.1 x86_64 and XML 3.9-4 on Fedora 16). sessionInfo()
>> will help us.
>>
>> BTW, see ?RSiteSearch to search for R content on the Web.
>>
>>
>> Cheers
>>
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.


