[BioC] BioPAX parsing

Frank Kramer dev at frankkramer.de
Thu Sep 20 14:46:05 CEST 2012


Hello,

Coming from the network reconstruction side of things, I also ran into 
the problem of getting BioPAX data into R.

I wrote an R package, rBiopaxParser, that allows you to parse a BioPAX 
.owl export from the file system into R using the XML package.
Once the data is loaded you can access, modify and merge it, and as long 
as the result is still valid BioPAX it can be exported to .owl again. 
There are a lot of convenience functions available; see the manual for a 
complete list.
The internal data representation is in a tabular format, quite close to 
the actual XML/RDF data.
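
To give a rough idea of what "tabular, close to the XML/RDF" can mean, 
here is a minimal sketch in Python. It is not the package's code: the 
function name flatten(), the column names and the tiny OWL fragment are 
all made up for illustration, and the fragment is not a complete BioPAX 
document. Each property of each instance simply becomes one row.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
BP = "http://www.biopax.org/release/biopax-level2.owl#"

# Toy fragment for illustration only (hypothetical IDs, not a full export).
SAMPLE_OWL = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:bp="{BP}">
  <bp:protein rdf:ID="protein1">
    <bp:NAME rdf:datatype="http://www.w3.org/2001/XMLSchema#string">TP53</bp:NAME>
  </bp:protein>
  <bp:control rdf:ID="control1">
    <bp:CONTROL-TYPE rdf:datatype="http://www.w3.org/2001/XMLSchema#string">ACTIVATION</bp:CONTROL-TYPE>
    <bp:CONTROLLER rdf:resource="#protein1"/>
  </bp:control>
</rdf:RDF>"""


def local(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on names."""
    return tag.rsplit("}", 1)[-1]


def flatten(owl_text):
    """Return one dict (row) per property of every top-level instance."""
    rows = []
    root = ET.fromstring(owl_text)
    for instance in root:
        instance_id = instance.get(f"{{{RDF}}}ID")
        for prop in instance:
            # Assumption: each property carries at most one rdf attribute
            # (rdf:datatype for literals, rdf:resource for references).
            attr_name, attr_value = "", ""
            for key, value in prop.attrib.items():
                attr_name, attr_value = "rdf:" + local(key), value
            rows.append({
                "class": local(instance.tag),
                "id": instance_id,
                "property": local(prop.tag),
                "property_attr": attr_name,
                "property_attr_value": attr_value,
                "property_value": (prop.text or "").strip(),
            })
    return rows


if __name__ == "__main__":
    for row in flatten(SAMPLE_OWL):
        print(row)
```

Because references between instances are kept as plain values in a 
column, a table like this can be written back to XML row by row, which 
is the appeal of staying close to the original structure.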

Regulatory pathways can be transformed into graphs and visualized with 
Rgraphviz. This transformation loses information, because only control 
interactions are used to build these regulatory graphs.
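
The information loss is easy to see in a small sketch. The Python below 
is only an illustration under simplifying assumptions, not what the 
package does internally: the record layout and the function name 
regulatory_edges() are hypothetical, and controller and controlled are 
given directly as molecule names, whereas in real BioPAX a control points 
at an interaction that has to be resolved first. Anything that is not a 
control is dropped.

```python
# Hypothetical, already-resolved interaction records for illustration.
INTERACTIONS = [
    {"type": "control", "control_type": "ACTIVATION",
     "controller": "A", "controlled": "B"},
    {"type": "control", "control_type": "INHIBITION",
     "controller": "B", "controlled": "C"},
    # Not a control, so it never reaches the regulatory graph.
    {"type": "biochemicalReaction", "left": "C", "right": "D"},
]


def regulatory_edges(interactions):
    """Return signed edges (controller, controlled, sign) from controls.

    sign is +1 for activating and -1 for inhibiting control types.
    """
    edges = []
    for interaction in interactions:
        if interaction["type"] != "control":
            continue  # this is where the information is lost
        inhibits = interaction["control_type"].startswith("INHIBITION")
        edges.append((interaction["controller"],
                      interaction["controlled"],
                      -1 if inhibits else 1))
    return edges


if __name__ == "__main__":
    for edge in regulatory_edges(INTERACTIONS):
        print(edge)
```

In this toy input the reaction from C to D has no counterpart in the 
resulting edge list, so D does not appear in the graph at all.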
Parsing is currently restricted to BioPAX Level 2, but I'm already working 
on integrating Level 3.

There is a fair amount of documentation (including a walkthrough for the 
NCI BioCarta data export) in the vignette and the manual.
You can find the package at
https://github.com/frankkramer/rBiopaxParser

A direct link to the vignette is:
https://github.com/frankkramer/rBiopaxParser/blob/master/inst/doc/rBiopaxParserVignette_short.pdf?raw=true

If this is of interest to more users, I can polish the package some more 
and try to submit it to Bioconductor.
Let me know if you encounter any problems or have ideas for new features!

Best wishes,
Frank

--
Frank Kramer
University Medical Center Göttingen
Department for Medical Statistics
Statistical Bioinformatics
http://www.ams.med.uni-goettingen.de/amsneu/kramer-en.html


