[R] Reading XML attributes in R

Ben Tupper btupper at bigelow.org
Fri Apr 28 15:24:11 CEST 2017


Hi again,

It would be super easy if xml2::xml_attrs() accepted a list of attribute names and default values like xml2::xml_attr() does, but it doesn't.  Once you have a list of character vectors like that returned by your ...

ppt <- x %>% xml_find_all("precipitation") %>% xml_attrs()

...then you need only try to extract the fields you want.  Perhaps something like the following untested steps...

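m <- do.call(rbind, lapply(ppt, '[', c('unit', 'value', 'type')))

# name the columns before converting; rbind() can otherwise pick up NA names from a node that lacks these attributes
colnames(m) <- c('unit', 'value', 'type')

precip <- tibble::as_tibble(m)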
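
For what it's worth, here is a small self-contained sketch of the same idea using only base R, so it can be run without any XML at hand. The list ppt below is a hand-made stand-in for what xml_attrs() would return for the two precipitation nodes in your message, and the names fields and rows are only for illustration.

```r
# Stand-in for the result of xml_attrs(): one named character vector per node.
# (Assumption: typed in by hand from the two example nodes, not parsed from XML.)
ppt <- list(
  c(mode = "no"),
  c(unit = "3h", value = "0.0925", type = "rain")
)

# The attributes we want as columns.
fields <- c("unit", "value", "type")

# Index each vector by name; attributes that are absent come back as NA.
# unname() drops the (possibly NA) names so rbind() does not reuse them.
rows <- lapply(ppt, function(a) unname(a[fields]))

# Stack the rows into a character matrix, then convert to a data frame.
precip <- as.data.frame(do.call(rbind, rows), stringsAsFactors = FALSE)
colnames(precip) <- fields

# Attributes are always read as text, so convert numeric fields explicitly.
precip$value <- as.numeric(precip$value)

precip
```

The node that only carries mode="no" ends up as a row of NA, and the populated node keeps its values as they are.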

Bonne chance!
Ben

P.S.  Don't forget to change your email client to send plain text messages to this list.  Typically rich text and html emails get turned into hash by the R-help list services.

> On Apr 28, 2017, at 4:25 AM, Archit Soni <soni.archit1989 at gmail.com> wrote:
> 
> Thanks Ben, I got it working. I just need help with one more thing.
> 
> For one city I get a node like <precipitation mode="no"/>, and for another city it comes back as <precipitation unit="3h" value="0.0925" type="rain"/>
> 
> How can I make my code handle this dynamically? I am sorry to ask such novice questions, but it would be extremely helpful if you could help me with this.
> 
> So, I would want the resulting data set from this code: ppt <- (x %>% xml_find_all("precipitation") %>% xml_attrs())
> to always have the three columns: if mode is "no", the values should be NA, and if the values are populated, they should be kept as is.
> 
> Unit     Value      Type
> NA        NA         NA
> 3h        0.0925     rain
> 
> Thanks again and in advance ! 
> 
> Archit
> 
> On Thu, Apr 27, 2017 at 6:27 PM, Ben Tupper <btupper at bigelow.org> wrote:
> Hi,
> 
> There might be an easy solution out there already, but I suspect that you will need to parse the XML yourself.  The example below uses package xml2 not XML but you could do this with either.  The example simply shows how to get values out of the XML hierarchy.  Once you have the attributes you want in hand you can assemble the elements into a data frame (or a tibble from package tibble.)
> 
> By the way, I had to prepend your example with '<current>'
> 
> Cheers,
> Ben
> 
> ### START
> 
> library(tidyverse)
> library(xml2)
> 
> txt <- "<current><city id=\"2643743\" name=\"London\"><coord lon=\"-0.13\" lat=\"51.51\"/><country>GB</country><sun rise=\"2017-01-30T07:40:36\" set=\"2017-01-30T16:47:56\"/></city><temperature value=\"280.15\" min=\"278.15\" max=\"281.15\" unit=\"kelvin\"/><humidity value=\"81\" unit=\"%\"/><pressure value=\"1012\" unit=\"hPa\"/><wind><speed value=\"4.6\" name=\"Gentle Breeze\"/><gusts/><direction value=\"90\" code=\"E\" name=\"East\"/></wind><clouds value=\"90\" name=\"overcast clouds\"/><visibility value=\"10000\"/><precipitation mode=\"no\"/><weather number=\"701\" value=\"mist\" icon=\"50d\"/><lastupdate value=\"2017-01-30T15:50:00\"/></current>"
> 
> x <- read_xml(txt)
> 
> windspeed <- x %>%
>     xml_find_first("wind/speed") %>%
>     xml_attrs()
> 
> winddir <- x %>%
>     xml_find_first("wind/direction") %>%
>     xml_attrs()
> 
> windspeed
> #          value            name
> #          "4.6" "Gentle Breeze"
> 
> winddir
> #  value   code   name
> #  "90"    "E" "East"
> 
> ### END
> 
> 
> 
> > On Apr 27, 2017, at 6:08 AM, Archit Soni <soni.archit1989 at gmail.com> wrote:
> >
> > Hi All,
> >
> > I have an XML file like:
> >
> > <city id="2643743" name="London">
> > <coord lon="-0.13" lat="51.51"/>
> > <country>GB</country>
> > <sun rise="2017-01-30T07:40:36" set="2017-01-30T16:47:56"/>
> > </city>
> > <temperature value="280.15" min="278.15" max="281.15" unit="kelvin"/>
> > <humidity value="81" unit="%"/>
> > <pressure value="1012" unit="hPa"/>
> > <wind>
> > <speed value="4.6" name="Gentle Breeze"/>
> > <gusts/>
> > <direction value="90" code="E" name="East"/>
> > </wind>
> > <clouds value="90" name="overcast clouds"/>
> > <visibility value="10000"/>
> > <precipitation mode="no"/>
> > <weather number="701" value="mist" icon="50d"/>
> > <lastupdate value="2017-01-30T15:50:00"/>
> > </current>
> >
> > I want to create a data frame out of this XML but
> > obviously xmlToDataFrame() is not working.
> >
> > It has dynamic attributes. For example, the precipitation node could have
> > both value and mode attributes if there is precipitation in some city.
> >
> > My basic issue now is to read the XML attributes of different nodes and
> > convert them into a data frame. I have searched many forums but could not
> > find any help on this.
> >
> > For starters, please suggest a solution to parse the value of city node and
> > corresponding id, name, lat, long etc.
> >
> > I know I am asking a lot, thanks for reading and cheers! :)
> >
> > --
> > Regards
> > Archit
> >
> 
> Ben Tupper
> Bigelow Laboratory for Ocean Sciences
> 60 Bigelow Drive, P.O. Box 380
> East Boothbay, Maine 04544
> http://www.bigelow.org
> 

Ben Tupper
Bigelow Laboratory for Ocean Sciences
60 Bigelow Drive, P.O. Box 380
East Boothbay, Maine 04544
http://www.bigelow.org


