[R] parse XML file

Kai Serschmarn serschmarn at googlemail.com
Wed Jun 29 09:17:14 CEST 2011


Hi all,

this is my first post to this mailing list. I hope that anybody can
help me parse an XML file.
I found this website http://www.omegahat.org/RSXML/gettingStarted.html  
but unfortunately my XML file is not as easy as the one in the example.

Example:

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet href="http://werdis.dwd.de/css/UNIDART/climateTimeseriesOrderByStation.xsl" type="text/xsl"?>
<data xmlns="http://www.unidart.eu/xsd"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://www.unidart.eu/xsd
     http://werdis.dwd.de/conf/timeseriesExchangeType.xsd">
<stationname value="Aachen">
    <v date="2011-04-01" qualityLevel="high" latitude="50.7839"  
longitude="6.0947" altitude="202" unitA="m" geoQualityLevel="certain"  
unitV="degree C">14.1</v>
    <v date="2011-04-02">17.6</v>
    <v date="2011-04-03">11.5</v>
    <v date="2011-04-04">10.0</v>
    <v date="2011-04-05" qualityLevel="low">9.6</v>
    <v date="2011-04-06">16.0</v>
</stationname>
<stationname value="Ahaus">
    <v date="2011-04-01" qualityLevel="high" latitude="52.0828"  
longitude="6.9417" altitude="45.5" unitA="m" geoQualityLevel="certain"  
unitV="degree C">12.5</v>
    <v date="2011-04-02">15.9</v>
    <v date="2011-04-03">12.0</v>
    <v date="2011-04-04">10.1</v>
    <v date="2011-04-05">8.8</v>
    <v date="2011-04-06">13.5</v>
</stationname>
</data>


I would like to get a table in R like this:

stationname	date		value
Aachen		2011-04-01	14.1
Aachen		2011-04-02	17.6
.
.
.
Ahaus		2011-04-06	13.5

I tried to do this:

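library(XML)  # provides xmlTreeParse, xmlRoot, xmlSApply, xmlValue
doc = xmlRoot(xmlTreeParse("de.dwd.klis.TADM.xml"))
# xmlValue returns only the text content of each <v> element, not attributes
tmp = xmlSApply(doc, function(x) xmlSApply(x, xmlValue))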

but the station name was not parsed, because "Aachen" is stored in the
value attribute of the stationname element rather than as its text content.

Could anyone give some help?
Thanks,
kai.
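
A minimal sketch of one way to read the attribute described above, offered as an illustration rather than a confirmed answer from the list. It assumes the XML package is installed and uses a shortened inline copy of the sample instead of the real file. The namespace prefix "u" is an arbitrary label chosen here for the default namespace http://www.unidart.eu/xsd, and the names xml_text, v_nodes and result are hypothetical.

```r
library(XML)

# Shortened inline sample with the same shape as the file in the post
# (assumption: the real file keeps the same default namespace).
xml_text <- '<data xmlns="http://www.unidart.eu/xsd">
<stationname value="Aachen">
    <v date="2011-04-01" qualityLevel="high">14.1</v>
    <v date="2011-04-02">17.6</v>
</stationname>
<stationname value="Ahaus">
    <v date="2011-04-01">12.5</v>
    <v date="2011-04-02">15.9</v>
</stationname>
</data>'

# Parse into internal nodes so that XPath queries can be used.
doc <- xmlParse(xml_text, asText = TRUE)

# The elements live in a default namespace, so XPath needs an explicit prefix.
ns <- c(u = "http://www.unidart.eu/xsd")

# Collect every <v> element, whichever station it belongs to.
v_nodes <- getNodeSet(doc, "//u:v", namespaces = ns)

# One row per <v>: the station name comes from the parent's "value"
# attribute, the date from the node's own attribute, the reading from its text.
result <- data.frame(
  stationname = sapply(v_nodes, function(n) xmlGetAttr(xmlParent(n), "value")),
  date        = sapply(v_nodes, xmlGetAttr, "date"),
  value       = as.numeric(sapply(v_nodes, xmlValue)),
  stringsAsFactors = FALSE
)

print(result)
```

For the real file, xmlParse("de.dwd.klis.TADM.xml") would presumably replace the inline text. Walking up from each v node with xmlParent avoids a nested loop over stations, at the cost of one attribute lookup per row.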


