[R] trouble for parsing HTML files

Julien Velcin julien.velcin at univ-lyon2.fr
Fri Mar 23 08:10:40 CET 2012


Here it is:

R version 2.14.2 (2012-02-29)
Platform: i386-apple-darwin9.8.0/i386 (32-bit)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] XML_3.9-4

Thank you!

Julien

On Mar 22, 2012, at 10:12 PM, R. Michael Weylandt wrote:

> Please give sessionInfo() so we can know your version of XML.
>
> Michael
>
> On Thu, Mar 22, 2012 at 2:04 PM, Julien Velcin
> <jvelcin at chirouble.univ-lyon2.fr> wrote:
>> I use Mac OS X 10.5.8 with this version of R:
>>
>> R version 2.14.1 (2011-12-22)
>> Platform: i386-apple-darwin9.8.0/i386 (32-bit)
>>
>> I've tried the command "RSiteSearch", but with no result.
>>
>> BTW, note that the code I posted does work for some websites.
>>
>> Julien
>>
>>
>>
>>
>> 2012/3/22, Milan Bouchet-Valat <nalimilan at club.fr>:
>>> On Thursday, 22 March 2012 at 17:20 +0100, Julien Velcin wrote:
>>>> Hi all,
>>>>
>>>> Using the XML package, I'm not able to parse some HTML web pages. Here
>>>> is my code and the error message:
>>>>
>>>> library("XML")
>>>> url <- "http://www.huffingtonpost.com/social/GraniteSkyline?action=fans"
>>>> doc <- htmlParse(url)
>>>>
>>>> Error: Namespace prefix ꛀ of attribute (null) is not defined
>>>>
>>>> I've searched a lot on the Internet, but it's really difficult to find
>>>> something useful for R.
>>> What versions of R and XML are you using? The code you provided works
>>> fine here (R 2.14.1 x86_64 and XML 3.9-4 on Fedora 16). sessionInfo()
>>> will help us.
>>>
>>> BTW, see ?RSiteSearch to search for R content on the Web.
>>>
>>>
>>> Cheers
>>>
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.


