[BioC] RpsiXML issues with latest Biogrid release files

Sara JC Gosline sara.gosline at mail.mcgill.ca
Mon Dec 7 16:03:26 CET 2009

Hello again,

I recently installed RpsiXML and used it to successfully parse the 
latest XML files from IntAct. However, when I try the same functions 
with the latest release of BioGRID (to obtain assay-specific 
interactions instead of experiment-specific ones), I get a graph with a 
single node “NA” and 1 interaction. The sessionInfo() output is at the 
end of this email.

***Parsing XML files to graph:
I used the ‘PCA’ file since it is relatively short:
1 Entries found
Parsing entry 1
Parsing experiments: ...............................................
Parsing interactors:
100% ========================================>
Parsing interactions:
100% ========================================>
>  g
[1] "psimi25Graph"
[1] "RpsiXML"
>  nodes(g)
[1] "NA"
>  edges(g)
[1] "NA"

***Parsing XML file without graph:
To determine whether the problem lies in the parsing itself, I redid the 
parsing without converting the result to a graph object:

Here is the first bit of output:
>  g
interaction entry ( 2009-11-25 ):
[ organism ]: Arabidopsis thaliana Saccharomyces cerevisiae 
Schizosaccharomyces pombe
[ taxonomy ID ]: 3702 4932 4896
[ interactors ]: there are 1214 interactors in total, here are the first 
few ones:
     sourceDb sourceId shortLabel uniprotId organismName               taxId
<NA> ""       "1"      "BZR1"     NA        "Arabidopsis thaliana"     "3702"
<NA> ""       "2"      "GRF6"     NA        "Arabidopsis thaliana"     "3702"
<NA> ""       "3"      "FUN14"    NA        "Saccharomyces cerevisiae" "4932"
<NA> ""       "4"      "UIP4"     NA        "Saccharomyces cerevisiae" "4932"
<NA> ""       "5"      "ALO1"     NA        "Saccharomyces cerevisiae" "4932"
<NA> ""       "6"      "SPO7"     NA        "Saccharomyces cerevisiae" "4932"
[ interactions ]: there are 2736 interactions in total, here are the 
first few ones:
interaction ( NA ):
[ source database ]:
[ source experiment ID ]: 1
[ interaction type ]: protein complementation assay
[ experiment ]: pubmed 17681130
[ participant ]: NA NA
[ bait ]: 1
[ bait UniProt ]: NA
[ prey ]: 2
[ prey UniProt ]: NA

So the interactors and interactions appear to be parsed correctly, but 
are not being retrieved properly. When I look at the attributes of an 
interaction, I get mostly NAs:
>  attributes(g@interactions[[1]])
[1] ""

[1] NA

[1] "protein complementation assay"

[1] "17681130"

[1] "1"

[1] NA

<NA> <NA>

[1] "1"

[1] NA

[1] "2"

[1] NA

[1] NA

[1] NA

[1] "psimi25Interaction"
[1] "RpsiXML"
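
One possible manual lookup, sketched here only as an illustration: the 
bait and prey values in the interaction ("1" and "2") look like they 
match the sourceId column of the interactor table, so the short labels 
could be recovered with a simple match. The data below is retyped by 
hand from the output above; the helper name idToLabel is hypothetical, 
and how the values would be extracted from the real parsed object is an 
assumption not shown here.

```r
# Illustrative only: the first few interactors and the first interaction,
# retyped by hand from the printed output above (not read from the file).
interactors <- data.frame(
  sourceId   = c("1", "2", "3", "4", "5", "6"),
  shortLabel = c("BZR1", "GRF6", "FUN14", "UIP4", "ALO1", "SPO7"),
  taxId      = c("3702", "3702", "4932", "4932", "4932", "4932"),
  stringsAsFactors = FALSE
)

interactions <- data.frame(
  bait = "1",
  prey = "2",
  stringsAsFactors = FALSE
)

# Hypothetical helper: translate internal interactor IDs to short labels.
# IDs that are not in the table come back as NA instead of raising an error.
idToLabel <- function(ids, table) {
  table$shortLabel[match(ids, table$sourceId)]
}

# Two-column bait/prey edge list expressed in short labels.
edgeList <- data.frame(
  bait = idToLabel(interactions$bait, interactors),
  prey = idToLabel(interactions$prey, interactors),
  stringsAsFactors = FALSE
)
print(edgeList)
```

A two-column table like this could then probably be turned into a graph 
by hand, for example by passing it as a matrix to ftM2graphNEL() from 
the graph package, although that has not been tried on the BioGRID data.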

Is there an easy workaround for this? Maybe where I can manually look up 



>  sessionInfo()
R version 2.8.1 (2008-12-22)


attached base packages:
[1] grid splines tools stats graphics grDevices utils
[8] datasets methods base

other attached packages:
[1] gtools_2.5.0-1 multicore_0.1-3 ppiStats_1.8.0
[4] RColorBrewer_1.0-2 lattice_0.17-17 ScISI_1.14.0
[7] apComplex_2.8.0 ppiData_0.1.13 Rgraphviz_1.20.4
[10] org.Sc.sgd.db_2.2.6 GOstats_2.8.0 Category_2.8.4
[13] genefilter_1.22.0 survival_2.34-1 GO.db_2.2.5
[16] RSQLite_0.7-1 DBI_0.2-4 RpsiXML_1.0.0
[19] RBGL_1.20.0 hypergraph_1.14.0 graph_1.20.0
[22] XML_2.3-0 annotate_1.20.1 xtable_1.5-6
[25] AnnotationDbi_1.4.3 Biobase_2.2.2

loaded via a namespace (and not attached):
[1] cluster_1.11.11 GSEABase_1.4.0
