[BioC] BioPAX parsing

Oliver Ruebenacker curoli at gmail.com
Fri Jun 15 21:23:26 CEST 2012


     Hello Martin,

  I don't have R code to test yet, but I do have extensive experience
handling BioPAX in Java, so I expect that reading BioPAX through rJava
should not be too difficult.

  The best target format depends on what people would like to do with
the data. For visualization, a bipartite graph in a popular
graph-layout package would probably work best. Is there any particular
graph package in Bioconductor, or in R in general, that you would
recommend?

  For actual analysis, people probably have more specific requirements.

  BioPAX is a format based on RDF/OWL, which in turn organizes data as
triples (subject, predicate, object). These could be stored in a
three-column data frame (perhaps with a fourth column for the data
type). For example (incomplete, for illustration only):

  ex:mapPhosphorylization   rdf:type   bp:BiochemicalReaction.
  ex:atp   rdf:type   bp:SmallMolecule.
  ex:adp   rdf:type   bp:SmallMolecule.
  ex:map   rdf:type   bp:Protein.
  ex:mapPhosphorylized   rdf:type   bp:Protein.
  ex:mapPhosphorylization   bp:left   ex:atp.
  ex:mapPhosphorylization   bp:left   ex:map.
  ex:mapPhosphorylization   bp:right   ex:adp.
  ex:mapPhosphorylization   bp:right   ex:mapPhosphorylized.
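
  As a rough sketch of the idea above (written in Python purely for
illustration, not the R package under discussion), the example triples
can be held as three-column rows and reduced to the edges of a
bipartite reaction/substance graph. The names TRIPLES,
subjects_of_type and bipartite_edges are hypothetical, and the edge
direction (left participant -> reaction -> right participant) is an
assumed convention, not something BioPAX prescribes.

```python
# Illustrative only: the example triples as (subject, predicate, object) rows,
# i.e. the three columns of the proposed data frame.
TRIPLES = [
    ("ex:mapPhosphorylization", "rdf:type", "bp:BiochemicalReaction"),
    ("ex:atp", "rdf:type", "bp:SmallMolecule"),
    ("ex:adp", "rdf:type", "bp:SmallMolecule"),
    ("ex:map", "rdf:type", "bp:Protein"),
    ("ex:mapPhosphorylized", "rdf:type", "bp:Protein"),
    ("ex:mapPhosphorylization", "bp:left", "ex:atp"),
    ("ex:mapPhosphorylization", "bp:left", "ex:map"),
    ("ex:mapPhosphorylization", "bp:right", "ex:adp"),
    ("ex:mapPhosphorylization", "bp:right", "ex:mapPhosphorylized"),
]


def subjects_of_type(triples, type_name):
    """Return the set of subjects declared with the given rdf:type."""
    return {s for s, p, o in triples if p == "rdf:type" and o == type_name}


def bipartite_edges(triples):
    """Derive directed edges between substances and reactions.

    Assumed convention (hypothetical, for illustration):
    left participant -> reaction, reaction -> right participant.
    """
    reactions = subjects_of_type(triples, "bp:BiochemicalReaction")
    edges = []
    for s, p, o in triples:
        if s not in reactions:
            continue  # only reaction subjects carry bp:left / bp:right here
        if p == "bp:left":
            edges.append((o, s))
        elif p == "bp:right":
            edges.append((s, o))
    return edges


if __name__ == "__main__":
    for source, target in bipartite_edges(TRIPLES):
        print(source, "->", target)
```

  Every edge joins a substance to a reaction and never two substances
directly, which is what makes the resulting graph bipartite.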

     Take care
     Oliver

On Fri, Jun 15, 2012 at 3:03 PM, Martin Preusse
<martin.preusse at googlemail.com> wrote:
> Hi Oliver,
>
> I think there is a lot of interest in a Bioconductor package!
>
> Personally, I would like to read pathways stored in the BioPAX format into any kind of graph. It's a philosophical question whether reactions should be nodes or should sit on the edges :) So far I have not used any R graph package, but I assume there are some very generic packages that are flexible enough to support both direct and bipartite pathway structures. For example, I have used the JUNG graph API for Java extensively.
>
> I'm not sure what you mean by RDF/OWL triples. For me, BioPAX is only a format to store a pathway, and I would like to bring it back into its natural form: a network!
>
> Do you have any code to test? I have used rJava before. All this RDF and XML file format stuff kind of puzzles me though … :)
>
> Cheers
> Martin
>
>
>
> On Friday, 15 June 2012 at 18:32, Oliver Ruebenacker wrote:
>
>> Hello Martin,
>>
>> I'm currently looking into reading BioPAX into R using rJava and
>> OpenRDF Sesame. If there is interest, I may look into submitting
>> a package to Bioconductor.
>>
>> It would be very helpful if you could tell me what you need the
>> BioPAX data for, and in what form it would be best for you. Possible
>> options are:
>>
>> - A data frame of the RDF/OWL triples
>> - A graph of the RDF/OWL triples
>> - A data frame with one row for each reaction-participant
>> - A bipartite graph with nodes for reactions and nodes for substances
>> - A graph with nodes for substances only, with edges for interactions
>> - A genetic interaction graph
>>
>> This list is roughly sorted from the easiest to the most
>> difficult to provide.
>>
>> Take care
>> Oliver
>>
>> On Thu, Jun 14, 2012 at 10:01 AM, Martin Preusse
>> <martin.preusse at googlemail.com> wrote:
>> > Many biological pathway resources provide their data in the BioPAX format (http://www.biopax.org/index.php), a special XML format for biological interaction networks. Examples are Pathway Commons (http://www.pathwaycommons.org/pc/) and Reactome (http://www.reactome.org/).
>> >
>> > A Java library for parsing BioPAX files exists: http://www.biopax.org/paxtools.php
>> >
>> > Has anybody used BioPAX files with R? Is it possible to read BioPAX files into any R-based graph structure? A solution similar to the KEGGgraph package for KEGG pathways would be great, since more and more databases are starting to use BioPAX.
>> >
>> >
>> > Any ideas are appreciated!
>> >
>> > Cheers
>> > Martin
>> >
>>
>>
>>
>>
>>
>> --
>> Oliver Ruebenacker
>> Bioinformatics Consultant (http://www.knowomics.com/wiki/Oliver_Ruebenacker)
>> Knowomics, The Bioinformatics Network (http://www.knowomics.com)
>> SBPAX: Turning Bio Knowledge into Math Models (http://www.sbpax.org)
>>
>
>
>



-- 
Oliver Ruebenacker
Bioinformatics Consultant (http://www.knowomics.com/wiki/Oliver_Ruebenacker)
Knowomics, The Bioinformatics Network (http://www.knowomics.com)
SBPAX: Turning Bio Knowledge into Math Models (http://www.sbpax.org)


