[BioC] KEGGGraph: some complexed proteins are orphans in graphNEL

Paul Shannon pshannon at systemsbiology.org
Fri May 1 00:23:03 CEST 2009


We have been using the admirable KEGGgraph package to obtain pathways
in graphNEL form.  It is very useful.

mTor is the signalling pathway we are working with: http://www.genome.jp/dbget-bin/get_pathway?org_name=hsa&mapno=04150

We find that proteins which appear only as members of a complex are  
orphans in the graphNEL.

For instance, "hsa:7248" (TSC1) forms a complex with "hsa:7249" (TSC2).
TSC2 is well connected, but its complex partner TSC1 is an orphan.

There are a number of ways to handle this, some quite sophisticated,  
some not.  One could define a node for the complex, create edges to
that node, and then specify (with a 'complex membership' edge) that  
TSC1 and TSC2 both belong.

mTor presents a good (though challenging) use case: there are two
differently-acting complexes which include mTor and GBL.  The third
member differs between the two complexes, however, as do the
interactions the two complexes participate in.  This seems to argue
for 'complex' being a node type.
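
To make the 'complex as a node' idea concrete, here is a minimal sketch in
base R.  It works on a plain edge table rather than on the graphNEL itself.
The helper add.complex.node, the 'type' column and the identifier
'complex:TSC1_TSC2' are all hypothetical names chosen for illustration;
none of them come from KEGGgraph.

```r
# Toy edge table: the one interaction reported for TSC2 in the parsed pathway
edges.df <- data.frame (from = "hsa:7249", to = "hsa:6009",
                        type = "interaction", stringsAsFactors = FALSE)

# Hypothetical helper: reroute the members' interactions through a new
# complex node, then add one 'complex membership' edge per member.
add.complex.node <- function (edges.df, complex.id, members) {
  edges.df$from[edges.df$from %in% members] <- complex.id
  edges.df$to[edges.df$to %in% members] <- complex.id
  membership <- data.frame (from = members, to = complex.id,
                            type = "complex membership",
                            stringsAsFactors = FALSE)
  unique (rbind (edges.df, membership))
}

with.complex <- add.complex.node (edges.df, "complex:TSC1_TSC2",
                                  c ("hsa:7248", "hsa:7249"))
print (with.complex)
```

Note that this sketch assumes each protein belongs to a single complex.
That assumption fails for mTor and GBL, where the interactions would have
to be assigned to each complex from the KGML itself rather than inferred
from membership.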

One simple improvement, which solves some of the 'orphan complex node'  
problem, could be this workaround:  all members of each complex  
participate in all the interactions which belong to the complex.
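
A rough sketch of that workaround, again in base R, operating on an
adjacency list of the same shape that edges () returns for a graphNEL.
The helper share.complex.edges and the node 'hypothetical:upstream' are
made up for illustration; the list of complex members is assumed to be
known already, since the parsed graph does not supply it.

```r
# Toy adjacency list: names are source nodes, values are their target nodes.
# 'hypothetical:upstream' is an invented node standing in for any regulator.
adj <- list ("hypothetical:upstream" = "hsa:7249",
             "hsa:7248" = character (0),
             "hsa:7249" = "hsa:6009",
             "hsa:6009" = character (0))

# Hypothetical helper: every member of the complex receives all the
# outgoing and incoming edges that any member has.
share.complex.edges <- function (adj, members) {
  # outgoing: union of the members' targets, excluding the members themselves
  out <- setdiff (unique (unlist (adj[members], use.names = FALSE)), members)
  for (m in members)
    adj[[m]] <- out
  # incoming: a node that targets one member now targets all of them
  for (n in setdiff (names (adj), members)) {
    if (any (adj[[n]] %in% members))
      adj[[n]] <- union (adj[[n]], members)
  }
  adj
}

shared <- share.complex.edges (adj, c ("hsa:7248", "hsa:7249"))
print (shared)
```

This deliberately drops any edge between two members of the same complex,
and it over-connects proteins that sit in more than one complex, so it is
only a partial fix.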

Here is some incomplete (but suggestive) evidence of the orphan status  
of TSC1.  A more detailed search reveals that TSC1 is not found in the  
target nodes of any of the edges of g.mTor.

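library (KEGGgraph)   # provides parseKGML2Graph

f <- '~/s/data/public/kegg/hsa04150.xml'   # local copy of the KGML file
g.mTor <- parseKGML2Graph (f)
tsc1 <- 'hsa:7248'
tsc2 <- 'hsa:7249'
# both proteins are present as nodes
tsc1 %in% nodes (g.mTor)  #  TRUE
tsc2 %in% nodes (g.mTor)  #  TRUE
# both have an entry in the edge list
tsc2 %in% names (edges (g.mTor)) # TRUE
tsc1 %in% names (edges (g.mTor)) # TRUE
# but only TSC2 has an outgoing edge
edges (g.mTor)[[tsc1]]   # character(0)
edges (g.mTor)[[tsc2]]   # "hsa:6009"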
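
The 'more detailed search' can be sketched roughly as follows.  To keep it
runnable without the KGML file it uses a toy adjacency list of the same
shape as edges (g.mTor); find.orphans is a hypothetical helper, not part
of KEGGgraph or graph.

```r
# Toy adjacency list mirroring the results above: names are source nodes,
# values are the target nodes of their outgoing edges.
adj <- list ("hsa:7248" = character (0),
             "hsa:7249" = "hsa:6009",
             "hsa:6009" = character (0))

# Hypothetical helper: nodes with no outgoing edges which are also
# never the target of any edge.
find.orphans <- function (adj) {
  targets <- unique (unlist (adj, use.names = FALSE))
  no.out <- names (adj)[vapply (adj, length, integer (1)) == 0]
  setdiff (no.out, targets)
}

orphans <- find.orphans (adj)
print (orphans)
```

With the real graph one would presumably pass edges (g.mTor) in place of
the toy list.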

Thanks,

  - Paul


sessionInfo ()

R version 2.9.0 (2009-04-17)
i386-apple-darwin8.11.1

locale:
en_US/en_US/en_US/C/en_US/en_US

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
 [1] RBGL_1.20.0         gaggle_1.12.0       rJava_0.6-2
 [4] org.Hs.eg.db_2.2.6  RUnit_0.4.22        KEGG.db_2.2.5
 [7] RSQLite_0.7-1       DBI_0.2-4           AnnotationDbi_1.6.0
[10] Biobase_2.4.0       KEGGgraph_1.0.0     graph_1.22.0
[13] XML_2.3-0

loaded via a namespace (and not attached):
[1] cluster_1.11.13 tools_2.9.0


