[R] Reading XML attributes in R

Archit Soni soni.archit1989 at gmail.com
Fri Apr 28 21:47:16 CEST 2017


Thanks Ben, I'll give it a shot. Thanks again :)

On Apr 28, 2017 18:54, "Ben Tupper" <btupper at bigelow.org> wrote:

> Hi again,
>
> It would be super easy if xml2::xml_attrs() accepted attribute names and
> default values the way xml2::xml_attr() does, but it doesn't.  Once
> you have a list of character vectors like the one returned by your ...
>
> ppt <- x %>% xml_find_all("precipitation") %>% xml_attrs()
>
> ... then you only need to extract the fields you want.  Perhaps
> something like the following untested steps...
>
> fields <- c('unit', 'value', 'type')
> m <- do.call(rbind, lapply(ppt, '[', fields))  # absent attributes become NA
>
> colnames(m) <- fields           # name the columns before converting
> precip <- tibble::as_tibble(m)
>
> Good luck!
> Ben
>
> P.S.  Don't forget to change your email client to send plain text messages
> to this list.  Typically rich text and html emails get turned into hash by
> the R-help list services.
>
> > On Apr 28, 2017, at 4:25 AM, Archit Soni <soni.archit1989 at gmail.com> wrote:
> >
> > Thanks Ben, got it working. I just need help with one more thing.
> >
> > Suppose I have a node like <precipitation mode="no"/> and for some other
> > city it comes as <precipitation unit="3h" value="0.0925" type="rain"/>.
> >
> > How can I make my code handle this dynamically? I am sorry to ask such
> > novice questions, but it would be extremely helpful if you could help me
> > with this.
> >
> > So, I would want the resulting data set from this code:
> >
> > ppt <- (x %>% xml_find_all("precipitation") %>% xml_attrs())
> >
> > to have the three columns below, with NA values if mode is "no" and the
> > values as they are if they are populated.
> >
> > Unit     Value      Type
> > NA        NA         NA
> > 3h        0.0925     rain
> >
> > Thanks again and in advance !
> >
> > Archit
> >
> > On Thu, Apr 27, 2017 at 6:27 PM, Ben Tupper <btupper at bigelow.org> wrote:
> > Hi,
> >
> > There might be an easy solution out there already, but I suspect that
> > you will need to parse the XML yourself.  The example below uses package
> > xml2 not XML but you could do this with either.  The example simply shows
> > how to get values out of the XML hierarchy.  Once you have the attributes
> > you want in hand you can assemble the elements into a data frame (or a
> > tibble from package tibble.)
> >
> > By the way, I had to prepend your example with '<current>'
> >
> > Cheers,
> > Ben
> >
> > ### START
> >
> > library(tidyverse)
> > library(xml2)
> >
> > txt <- "<current><city id=\"2643743\" name=\"London\"><coord
> > lon=\"-0.13\" lat=\"51.51\"/><country>GB</country><sun
> > rise=\"2017-01-30T07:40:36\" set=\"2017-01-30T16:47:56\"/></city><temperature
> > value=\"280.15\" min=\"278.15\" max=\"281.15\" unit=\"kelvin\"/><humidity
> > value=\"81\" unit=\"%\"/><pressure value=\"1012\"
> > unit=\"hPa\"/><wind><speed value=\"4.6\" name=\"Gentle
> > Breeze\"/><gusts/><direction value=\"90\" code=\"E\"
> > name=\"East\"/></wind><clouds value=\"90\" name=\"overcast
> > clouds\"/><visibility value=\"10000\"/><precipitation
> > mode=\"no\"/><weather number=\"701\" value=\"mist\"
> > icon=\"50d\"/><lastupdate value=\"2017-01-30T15:50:00\"/></current>"
> >
> > x <- read_xml(txt)
> >
> > windspeed <- x %>%
> >     xml_find_first("wind/speed") %>%
> >     xml_attrs()
> >
> > winddir <- x %>%
> >     xml_find_first("wind/direction") %>%
> >     xml_attrs()
> >
> > windspeed
> > #          value            name
> > #          "4.6" "Gentle Breeze"
> >
> > winddir
> > #  value   code   name
> > #  "90"    "E" "East"
> >
> > ### END
> >
> >
> >
> > > On Apr 27, 2017, at 6:08 AM, Archit Soni <soni.archit1989 at gmail.com> wrote:
> > >
> > > Hi All,
> > >
> > > I have an XML file like:
> > >
> > > <city id="2643743" name="London">
> > > <coord lon="-0.13" lat="51.51"/>
> > > <country>GB</country>
> > > <sun rise="2017-01-30T07:40:36" set="2017-01-30T16:47:56"/>
> > > </city>
> > > <temperature value="280.15" min="278.15" max="281.15" unit="kelvin"/>
> > > <humidity value="81" unit="%"/>
> > > <pressure value="1012" unit="hPa"/>
> > > <wind>
> > > <speed value="4.6" name="Gentle Breeze"/>
> > > <gusts/>
> > > <direction value="90" code="E" name="East"/>
> > > </wind>
> > > <clouds value="90" name="overcast clouds"/>
> > > <visibility value="10000"/>
> > > <precipitation mode="no"/>
> > > <weather number="701" value="mist" icon="50d"/>
> > > <lastupdate value="2017-01-30T15:50:00"/>
> > > </current>
> > >
> > > I want to create a data frame out of this XML, but
> > > xmlToDataFrame() is not working.
> > >
> > > It has dynamic attributes: the precipitation node, for example, could
> > > have both value and mode attributes if there is precipitation in some
> > > city.
> > >
> > > My basic issue now is to read the XML attributes of different nodes and
> > > convert them into a data frame. I have searched many forums but could
> > > not find any help on this.
> > >
> > > For starters, please suggest a solution to parse the value of the city
> > > node and the corresponding id, name, lat, long etc.
> > >
> > > I know I am asking a lot, thanks for reading and cheers! :)
> > >
> > > --
> > > Regards
> > > Archit
> > >
> > > ______________________________________________
> > > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.
> >
> > Ben Tupper
> > Bigelow Laboratory for Ocean Sciences
> > 60 Bigelow Drive, P.O. Box 380
> > East Boothbay, Maine 04544
> > http://www.bigelow.org
> >
> > --
> > Regards
> > Archit
>
> Ben Tupper
> Bigelow Laboratory for Ocean Sciences
> 60 Bigelow Drive, P.O. Box 380
> East Boothbay, Maine 04544
> http://www.bigelow.org
>
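
For reference, here is a minimal sketch of the NA-filling idea discussed in the
quoted messages. It relies on xml2::xml_attr() returning NA for an attribute
that a node does not carry, so each precipitation node yields the same three
fields. The two one-node documents in `docs` and the names `docs`, `wanted`,
`rows` and `precip` are made up for illustration; real input would be the full
<current> documents, one per city.

```r
library(xml2)

# Hypothetical input: one <current> document per city, trimmed down to the
# <precipitation> node that the thread is about.
docs <- c(
  '<current><precipitation mode="no"/></current>',
  '<current><precipitation unit="3h" value="0.0925" type="rain"/></current>'
)

# The attributes we want as columns, whether or not a node has them.
wanted <- c("unit", "value", "type")

# For each document, look up every wanted attribute by name.
# xml_attr() gives NA_character_ when the attribute is absent.
rows <- lapply(docs, function(txt) {
  node <- xml_find_first(read_xml(txt), "precipitation")
  vapply(wanted, function(a) xml_attr(node, a), character(1))
})

# Stack the named vectors into a data frame with one row per city.
precip <- as.data.frame(do.call(rbind, rows), stringsAsFactors = FALSE)

# Attribute values arrive as text, so convert the numeric column explicitly.
precip$value <- as.numeric(precip$value)

precip
```

The row for the mode="no" node comes out as all NA, and the other row keeps
its values. tibble::as_tibble(precip) would turn the result into a tibble if
that is preferred.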
