[Rd] XML parsing under R / Extracting nodes’ values

Duncan Temple Lang duncan at wald.ucdavis.edu
Tue May 15 16:08:30 CEST 2007


You can use getNodeSet() as Hin-Tak suggests,
but you will need to call it once for each of the target nodes.
So you can use sapply() to loop over the node names.
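
A minimal sketch of that loop, assuming the XML package is installed. The document string `txt` and the names `targets` and `vals` are made up for illustration (the original file is only shown in part), and the sketch takes the first match for each node name:

```r
library(XML)

# Illustrative document; the real file has other nodes around these.
txt = "<doc><nbRelations>2</nbRelations><nbActors>2</nbActors><nbRuns>5</nbRuns><nbStep>2000</nbStep></doc>"

# getNodeSet() needs an internal document, hence useInternalNodes = TRUE.
doc = xmlTreeParse(txt, asText = TRUE, useInternalNodes = TRUE)

targets = c("nbRelations", "nbActors", "nbRuns", "nbStep")

# One XPath query per target name; keep the first matching node of each.
vals = sapply(targets, function(nm) {
    nodes = getNodeSet(doc, paste("//", nm, sep = ""))
    as.numeric(xmlValue(nodes[[1]]))
})

vals
```

Because sapply() is looping over a character vector, the result is named by the node names.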

However, if these nodes are all children of the same XML node,
you can get the values as

  # This is the document content.
z = "<doc><nbRelations>2</nbRelations>
<nbActors>2</nbActors>
<nbRuns>5</nbRuns>
<nbStep>2000</nbStep></doc>"


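 library(XML)

 # parse the document (asText = TRUE because z holds the XML itself, not a file name)
 d = xmlRoot(xmlTreeParse(z, asText = TRUE))

 # convert inside the function: as.numeric() on the result would drop the names
 xmlSApply(d, function(node) as.numeric(xmlValue(node)))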

and now you have a vector with named elements
corresponding to nbRelations, etc.

If these are children of a sub-node in the tree, then
you have to fetch that node first. Hopefully you can
get at that easily using subsetting of the document.
(Otherwise, you can do that with getNodeSet().
But getNodeSet() only works with internal documents,
so you need useInternalNodes = TRUE in the call to xmlTreeParse().)
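
A sketch of both routes, assuming the values sit under a sub-node. The `<params>` element, the document string and the variable names are hypothetical, since the original file's structure is not shown:

```r
library(XML)

# Hypothetical layout: the values live under a <params> sub-node.
txt = "<doc><title>run</title><params><nbRelations>2</nbRelations><nbActors>2</nbActors></params></doc>"

# Route 1: R-level tree, fetch the sub-node by subsetting on its name.
root = xmlRoot(xmlTreeParse(txt, asText = TRUE))
params = root[["params"]]
v1 = xmlSApply(params, function(node) as.numeric(xmlValue(node)))

# Route 2: internal document plus an XPath query for the children.
doc = xmlTreeParse(txt, asText = TRUE, useInternalNodes = TRUE)
kids = getNodeSet(doc, "/doc/params/*")
v2 = sapply(kids, function(node) as.numeric(xmlValue(node)))
names(v2) = sapply(kids, xmlName)
```

Both v1 and v2 should end up as numeric vectors named after the child nodes.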

I suggest that you don't assign these to regular,
top-level variables but access the values from the vector.
But if you really need to assign them to individual variables,

xmlSApply(d, function(node)
                assign(xmlName(node), as.numeric(xmlValue(node)), globalenv()))

will do the trick.
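
For comparison, keeping the values in one named vector and looking them up by name might look like the sketch below. The names `params`, `total` and `p` are illustrative only:

```r
library(XML)

z = "<doc><nbRelations>2</nbRelations><nbActors>2</nbActors><nbRuns>5</nbRuns><nbStep>2000</nbStep></doc>"
d = xmlRoot(xmlTreeParse(z, asText = TRUE))

# Named numeric vector, one element per child node.
params = xmlSApply(d, function(node) as.numeric(xmlValue(node)))

# Look values up by name instead of creating one global variable per node.
total = params[["nbRuns"]] * params[["nbStep"]]

# as.list() gives $-style access if that reads better.
p = as.list(params)
p$nbActors
```

This leaves the global environment untouched, and the whole set of values can be passed to a function as a single object.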

Hin-Tak Leung wrote:
> - you should have posted to either R-help or (more appropriately) to
> the omega-help list.
> 
> That said, you need something like this:
> 
> root.node <- xmlTreeParse(x, useInternalNodes = TRUE)
> nbrelation.set <- getNodeSet(root.node, "//nbRelations")
> nbrelation.list <- sapply(nbrelation.set, function(x) { xmlValue(x) } )
> 
> and nbrelation.list now contains the "2" in
> nbRelations as text - you may want to do as.numeric() on it as well.
> 
> Abdelhakim z wrote:
>> Hi,
>> I have an XML file which contains among other nodes :
>>
>> ===myXMLfile.xml===
>> (…)
>> <nbRelations>2</nbRelations>
>> <nbActors>2</nbActors>
>> (...)
>> <nbRuns>5</nbRuns>
>> <nbStep>2000</nbStep>
>> (…)
>> ===End file===
>> I need to extract those values and to make them R variables such as:
>> nbRelations = 2
>> nbActors = 2
>>
>> nbRuns = 5
>> nbSteps = 2000
>>
>> I read the help and have seen the examples of the xml package, it
>> seems that I need to use xmlTreeParse() function but I don't know how
>> exactly as I'm not an R advanced programmer, please can anyone show me
>> how to do that explicitly ?
>>
>> Any help would be much appreciated
>>
>> Thanks,
>>
>> Abdel
>> University of Boumerdès
>> Algeria
>>
>> ______________________________________________
>> R-devel at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel


