[R] Treatment of xml-stylesheet processing instructions in XML module

Duncan Temple Lang duncan at wald.ucdavis.edu
Thu Apr 7 01:06:19 CEST 2011


Hi Adam

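To use XPath and getNodeSet() on an XML document, parse the XML
content with xmlParse() rather than xmlTreeParse(). XPath queries
need the internal (C-level) document that xmlParse() returns, whereas
xmlTreeParse() converts it to R-level nodes by default. So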

t = xmlParse(I(a)) # I() marks a as XML text; or use asText = TRUE
elem = getNodeSet(t, "/rss/channel/item")[[1]]

works fine.

In getNodeSet(), pass the document itself rather than its root node.
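
For a version that runs without network access, here is a minimal sketch
of the same approach. The inline RSS snippet and the variable names
(doc, items, titles) are made up for illustration and stand in for the
real feed; the lines are pasted into a single string before parsing as
an explicit, conservative step.

```r
library(XML)

# Hypothetical stand-in for the feed: an xml-stylesheet processing
# instruction sits between the XML declaration and the <rss> root.
a <- c('<?xml version="1.0" encoding="UTF-8"?>',
       '<?xml-stylesheet type="text/xsl" media="screen" href="/~d/styles/rss2full.xsl"?>',
       '<rss version="2.0">',
       '  <channel>',
       '    <item><title>First</title></item>',
       '    <item><title>Second</title></item>',
       '  </channel>',
       '</rss>')

# Parse the text into an internal document, which XPath can query.
doc <- xmlParse(paste(a, collapse = "\n"), asText = TRUE)

# Query the document itself, not xmlRoot(doc).
items <- getNodeSet(doc, "/rss/channel/item")

# Pull the text of each item's <title> with a second XPath query.
titles <- xpathSApply(doc, "/rss/channel/item/title", xmlValue)
```

The processing instruction is simply skipped by the XPath expression,
so there should be no need to strip it from the text beforehand.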

Also, if you have the package loaded, you don't need the XML::
prefix before the function names.

  HTH
    D.

On 4/6/11 11:32 AM, Adam Cooper wrote:
> Hello again,
> Another stumble here that is defeating me.
> 
> I try:
> a<-readLines(url("http://feeds.feedburner.com/grokin"))
> t<-XML::xmlTreeParse(a, ignoreBlanks=TRUE, replaceEntities=FALSE,
> asText=TRUE)
> elem<- XML::getNodeSet(XML::xmlRoot(t),"/rss/channel/item")[[1]]
> 
> And I get:
> Start tag expected, '<' not found
> Error: 1: Start tag expected, '<' not found
> 
> When I modify the second line in "a" to remove the following (just
> leaving the <rss> tag with its attributes), I do not get the error.
> I removed:
> <?xml-stylesheet type=\"text/xsl\" media=\"screen\" href=
> \"/~d/styles/rss2full.xsl\"?><?xml-stylesheet type=\"text/css\" media=
> \"screen\" href=\"http://feeds.feedburner.com/~d/styles/itemcontent.css
> \"?>
> 
> I would have expected the PI to be totally ignored by default.
> Have I missed something??
> 
> Thanks in advance...
> 
> Cheers, Adam
