[R] XML package- accessing nodes based on attributes

Duncan Temple Lang duncan at wald.ucdavis.edu
Mon Feb 9 20:33:04 CET 2009



XPath is your friend here.


getNodeSet(mf,
           '//Characteristic[@Type="File" and @eName="FileTypeId"
                and @eValue="10"]/parent::File
            /Characteristic[@Type="Patient"
                         and @eName="PatientReference"]/@eValue')

I have broken the XPath expression across lines to try to format it
more legibly.

The basic idea is to first find only the Characteristic nodes which have
the required values ("File", "FileTypeId" and "10") for the
Type, eName and eValue attributes.
Then go back up to the parent <File> element,
select the other Characteristic that you want, and take its eValue attribute.
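
To see what each step of the path selects, here is a minimal sketch that
runs the pieces one at a time. The document in txt is a cut-down, made-up
version of the manifest (one matching and one non-matching <File>), and the
names txt, doc, marker, wanted, step1, step2, step3 and refs are just for
illustration; it assumes the XML package is installed.

```r
library(XML)

# Hypothetical, cut-down manifest: one file of type 10, one of type 20.
txt <- '<Manifest><FilesList>
  <File>
    <Characteristic Type="File" eName="FileTypeId" eValue="10"/>
    <Characteristic Type="Patient" eName="PatientReference" eValue="TCGA-06-0875-01A"/>
  </File>
  <File>
    <Characteristic Type="File" eName="FileTypeId" eValue="20"/>
    <Characteristic Type="Patient" eName="PatientReference" eValue="TCGA-06-0875-10A"/>
  </File>
</FilesList></Manifest>'

doc <- xmlParse(txt, asText = TRUE)

# The two building blocks of the full XPath expression.
marker <- '//Characteristic[@Type="File" and @eName="FileTypeId" and @eValue="10"]'
wanted <- 'Characteristic[@Type="Patient" and @eName="PatientReference"]'

# Step 1: the Characteristic nodes that mark a file as type 10.
step1 <- getNodeSet(doc, marker)

# Step 2: go back up to the enclosing <File> elements.
step2 <- getNodeSet(doc, paste0(marker, '/parent::File'))

# Step 3: within those files, the eValue attribute of the patient reference.
step3 <- getNodeSet(doc, paste0(marker, '/parent::File/', wanted, '/@eValue'))

# Flatten to a plain character vector.
refs <- as.character(unlist(step3))
```

Only the first <File> survives step 2, so refs should hold just its
PatientReference value.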

You can then call unlist() on the result if you want a character vector.
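
As a rough sketch of that last step, the example below flattens the
getNodeSet() result with unlist() and, as an alternative, uses
xpathSApply() with xmlGetAttr() to get the values directly. It also writes
the condition as a predicate on <File> instead of going down and back up
with parent::File, which should select the same nodes. The sample document
and the names txt, doc, path, refs_unlist and refs_sapply are made up for
illustration.

```r
library(XML)

# Hypothetical, cut-down manifest: two files of type 10, one of type 20.
txt <- '<Manifest><FilesList>
  <File>
    <Characteristic Type="File" eName="FileTypeId" eValue="10"/>
    <Characteristic Type="Patient" eName="PatientReference" eValue="TCGA-06-0875-01A"/>
  </File>
  <File>
    <Characteristic Type="File" eName="FileTypeId" eValue="10"/>
    <Characteristic Type="Patient" eName="PatientReference" eValue="TCGA-06-0875-02A"/>
  </File>
  <File>
    <Characteristic Type="File" eName="FileTypeId" eValue="20"/>
    <Characteristic Type="Patient" eName="PatientReference" eValue="TCGA-06-0875-10A"/>
  </File>
</FilesList></Manifest>'

doc <- xmlParse(txt, asText = TRUE)

# Select <File> elements that contain the type-10 marker, then the
# PatientReference Characteristic inside each of them.
path <- paste0(
  '//File[Characteristic[@Type="File" and @eName="FileTypeId" and @eValue="10"]]',
  '/Characteristic[@Type="Patient" and @eName="PatientReference"]'
)

# Variant 1: ask XPath for the attribute, then flatten the list.
# unname() drops any names carried over from the attribute.
refs_unlist <- unname(unlist(getNodeSet(doc, paste0(path, '/@eValue'))))

# Variant 2: select the elements and read the attribute in R.
refs_sapply <- xpathSApply(doc, path, xmlGetAttr, "eValue")
```

Both variants should give the two references for the type-10 files, in
document order.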

  D.

Skewes,Aaron wrote:
> Hi,
> 
> I have a rather complex xml document that I am attempting to parse based on attributes:
> 
> <Manifest xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
>   <!-- eName       : name of the element.
>       eValue       : value of the element. -->
>   <OutputFilePath>D:\CN_data\Agilent\Results\</OutputFilePath>
>   <FilesList>
>     <File>
>         <Characteristic Type="File" eName="FileTypeId" eValue="10"/>
>                 <Characteristic Type="File" eName="FilePath" eValue="D:\CN_data\Agilent\TCGA-06-0875-01A-01D-0387-02_US23502331_251469343372_S01_CGH-v4_10_Apr08.txt"/>
>                 <Characteristic Type ="Patient" eName="PatientReference" eValue="TCGA-06-0875-01A"/>
>                 <Characteristic Type ="Patient" eName="SampleType" eValue="TUMOR"/>
>                 <Characteristic Type ="Patient" eName="SampleMarker" eValue="cy3"/>
>                 <Characteristic Type ="Patient" eName="PatientDateOfBirth" eValue="080808"/>
>                 <Characteristic Type ="Patient" eName="PatientGender" eValue="M"/>
>                 <Characteristic Type ="Patient" eName="PatientSampleConcentration" eValue="20mg"/>
>     </File>
>     <File>
>         <Characteristic Type="File" eName="FileTypeId" eValue="10"/>
>                 <Characteristic Type="File" eName="FilePath" eValue="D:\CN_data\Agilent\TCGA-06-0875-01A-01D-0387-02_US23502331_251469343372_S02_CGH-v4_10_Apr08.txt"/>
>                 <Characteristic Type ="Patient" eName="PatientReference" eValue="TCGA-06-0875-02A"/>
>                 <Characteristic Type ="Patient" eName="SampleType" eValue="TUMOR"/>
>                 <Characteristic Type ="Patient" eName="SampleMarker" eValue="cy3"/>
>                 <Characteristic Type ="Patient" eName="PatientDateOfBirth" eValue="080808"/>
>                 <Characteristic Type ="Patient" eName="PatientGender" eValue="M"/>
>                 <Characteristic Type ="Patient" eName="PatientSampleConcentration" eValue="20mg"/>
>     </File>
> 
>     <File>
>                 <Characteristic Type="File" eName="FileTypeId" eValue="20"/>
>                 <Characteristic Type="File" eName="FilePath" eValue="D:\CN_data\Agilent\TCGA-06-0875-10A-01D-0387-02_US23502331_251469342195_S01_CGH-v4_10_Apr08.txt"/>
>                 <Characteristic Type ="Patient" eName="PatientReference" eValue="TCGA-06-0875-10A"/>
>                 <Characteristic Type ="Patient" eName="SampleType" eValue="NORMAL"/>
>                 <Characteristic Type ="Patient" eName="SampleMarker" eValue="cy3"/>
>                 <Characteristic Type ="Patient" eName="PatientDateOfBirth" eValue="080808"/>
>                 <Characteristic Type ="Patient" eName="PatientGender" eValue="M"/>
>                 <Characteristic Type ="Patient" eName="PatientSampleConcentration" eValue="20mg"/>
>     </File>
> 
> 
> My requirement is to access eValues at each <File> node based on FileTypeId. For example:
> 
> How can I get the eValue of eName="PatientReference" for all Type="Patient", where the same <File> contains <Characteristic Type="File" eName="FileTypeId" eValue="10"/>?
> 
> i.e. "TCGA-06-0875-01A" and "TCGA-06-0875-02A"
> 
> 
> For the life of me, I cannot get this to work!
> 
> Thanks,
> -Aaron