[R] Failure to understand namespaces in XML::getNodeSet

Hadley Wickham h.wickham at gmail.com
Tue Jan 31 23:27:25 CET 2017


See the last example in ?xml2::xml_find_all or use xml2::xml_ns_strip()
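
A minimal sketch of both suggestions, assuming the xml2 package is installed (a version that provides xml_ns_strip()) and reusing the namespaced XML from the quoted message below. The pieces are pasted into a single string here because read_xml() expects one string; the variable names doc, ns, found and stripped are only illustrative.

```r
library(xml2)

# The namespaced document from the quoted message, collapsed to one string
with_ns_xml <- paste0(
  "<?xml version=\"1.0\" ?>",
  "<WorkSet xmlns=\"http://labkey.org/etl/xml\">",
  "<Description>MFIA 9-Plex (CharlesRiver)</Description>",
  "</WorkSet>")

doc <- read_xml(with_ns_xml)

# Option 1: keep the namespace and use a prefix in the XPath.
# xml_ns() names the document's default namespace "d1".
ns <- xml_ns(doc)
found <- xml_find_all(doc, "/d1:WorkSet//d1:Description", ns)
xml_text(found)

# Option 2: strip the default namespace (this modifies doc in place),
# after which the unprefixed XPath matches.
xml_ns_strip(doc)
stripped <- xml_find_all(doc, "/WorkSet//Description")
xml_text(stripped)
```

Option 1 leaves the document untouched, which may matter if it is written back out; option 2 is more convenient when the namespace is irrelevant to the analysis.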

Hadley
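
For the XML package code in the quoted message, the same idea appears to apply: in XPath 1.0 an unprefixed name matches only elements in no namespace, so the path itself needs a prefix that is bound to the namespace URI. The following is a sketch under that assumption; the prefix "ns" is an arbitrary choice, and xmlParse() is used here in place of xmlTreeParse(..., useInternalNodes = TRUE).

```r
library(XML)

# The namespaced document from the quoted message, collapsed to one string
with_ns_xml <- paste0(
  "<?xml version=\"1.0\" ?>",
  "<WorkSet xmlns=\"http://labkey.org/etl/xml\">",
  "<Description>MFIA 9-Plex (CharlesRiver)</Description>",
  "</WorkSet>")

doc <- xmlParse(with_ns_xml, asText = TRUE)

# Bind an arbitrary prefix to the namespace URI ...
ns <- c(ns = "http://labkey.org/etl/xml")

# ... and use that prefix on every element name in the XPath
nodes <- getNodeSet(doc, "/ns:WorkSet//ns:Description", namespaces = ns)
xmlValue(nodes[[1]])
```

Naming the vector element "xmlns", as in the quoted attempt, does not help while the XPath element names remain unprefixed.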

On Tue, Jan 31, 2017 at 9:43 AM, Mark Sharp <msharp at txbiomed.org> wrote:
> I am trying to read a series of XML files that use a namespace and I have failed, thus far, to discover the proper syntax. I have a reproducible example below. I have two XML character strings defined: one without a namespace and one with. I show that I can successfully extract the node using the XML string without the namespace and fail when using the XML string with the namespace.
>
> Mark
> PS I am having the same problem with the xml2 package and am hoping that understanding one will help with the other.
>
> ##
> library(XML)
> ## The first XML text (no_ns_xml) does not have a namespace defined
> no_ns_xml <- c("<?xml version=\"1.0\" ?>", "<WorkSet>",
>                "<Description>MFIA 9-Plex (CharlesRiver)</Description>",
>                "</WorkSet>")
> l_no_ns_xml <- xmlTreeParse(no_ns_xml, asText = TRUE, getDTD = FALSE,
>                             useInternalNodes = TRUE)
> ## The node is found
> getNodeSet(l_no_ns_xml, "/WorkSet//Description")
>
> ## The second XML text (with_ns_xml) has a namespace defined
> with_ns_xml <- c("<?xml version=\"1.0\" ?>",
>                  "<WorkSet xmlns=\"http://labkey.org/etl/xml\">",
>                  "<Description>MFIA 9-Plex (CharlesRiver)</Description>",
>                  "</WorkSet>")
>
> l_with_ns_xml <- xmlTreeParse(with_ns_xml, asText = TRUE, getDTD = FALSE,
>                               useInternalNodes = TRUE)
> ## The node is not found
> getNodeSet(l_with_ns_xml, "/WorkSet//Description")
> ## I attempt to provide the namespace, but fail.
> ns <-  "http://labkey.org/etl/xml"
> names(ns)[1] <- "xmlns"
> getNodeSet(l_with_ns_xml, "/WorkSet//Description", namespaces = ns)
>
> R. Mark Sharp, Ph.D.
> Director of Data Science Core
> Southwest National Primate Research Center
> Texas Biomedical Research Institute
> P.O. Box 760549
> San Antonio, TX 78245-0549
> Telephone: (210)258-9476
> e-mail: msharp at TxBiomed.org
>



-- 
http://hadley.nz


