[BioC] total gene number for a given species in reactome.db

Marc Carlson mcarlson at fhcrc.org
Mon Aug 13 19:54:33 CEST 2012


Hi Gilbert,

If you are using the new reactome.db (the one in devel), then you can do 
this:

library(reactome.db)
library(org.Hs.eg.db)

## gets all Entrez IDs from the DB
rk = keys(reactome.db, keytype="ENTREZID")
## gets all the Entrez IDs from the most recent org pkg
hk = keys(org.Hs.eg.db, keytype="ENTREZID")

## a cursory glance shows that both overlaps are the same size:
table(hk %in% rk)
table(rk %in% hk)
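
If you want a single number rather than two tables, something along these lines might do it. This is only a sketch: count_shared_ids is a made-up helper name (it is not part of reactome.db or AnnotationDbi), and the toy vectors just stand in for the real rk and hk from above.

```r
# Hypothetical helper (not from any package): summarise how two
# vectors of Entrez IDs overlap, after removing duplicates.
count_shared_ids <- function(rk, hk) {
  rk <- unique(as.character(rk))
  hk <- unique(as.character(hk))
  c(reactome = length(rk),                  # unique IDs in the first vector
    org      = length(hk),                  # unique IDs in the second vector
    shared   = length(intersect(rk, hk)))   # IDs present in both
}

# Toy IDs standing in for the real key vectors
toy_rk <- c("1", "2", "3", "3", "10")
toy_hk <- c("2", "3", "4")
res <- count_shared_ids(toy_rk, toy_hk)
print(res)

# With both packages loaded, the call would presumably be:
# count_shared_ids(keys(reactome.db, keytype = "ENTREZID"),
#                  keys(org.Hs.eg.db, keytype = "ENTREZID"))
```

The "shared" entry is the count of Entrez IDs that reactome.db has in common with the human org package, which is the kind of gene universe size an over-representation test needs.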


Or if you are a wiz at reactome.db, the latest reactome.db package has 
the ENTIRE Reactome database stashed inside.  So you might be able to 
just write a query against it and specify that you only want human Entrez IDs.
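
As a starting point for the query route, you could look at what is actually in the packaged database before writing any SQL. This sketch assumes the devel reactome.db exports reactome_dbconn() like other AnnotationDbi-style packages and that DBI is installed. I am deliberately not guessing any table or column names here, so the real query is left as a placeholder.

```r
# Check that the packages this sketch assumes are available.
have_pkgs <- requireNamespace("reactome.db", quietly = TRUE) &&
  requireNamespace("DBI", quietly = TRUE)

if (have_pkgs) {
  # Connection to the SQLite database shipped inside the package
  con <- reactome.db::reactome_dbconn()

  # List the tables so you can find the ones that hold species
  # and Entrez ID information
  print(DBI::dbListTables(con))

  # Once you know the schema, a query would go through something like:
  # DBI::dbGetQuery(con, "<your SQL here>")
} else {
  message("reactome.db and/or DBI not installed; nothing to inspect")
}
```

The connection is shared by the package, so it is probably best not to disconnect it yourself.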


   Marc



On 08/09/2012 10:02 PM, Gang Feng wrote:
> Hello,
>
> I am using reactome.db for an over-representation enrichment test, so I wonder how I can get the total gene number for a given species in reactome.db. For example, how many human genes (unique Entrez IDs) are annotated in reactome.db? Is there any simple way to get this number besides counting the shared genes between the annotated genes from "reactomeEXTID2PATHID" and records from "org.Hs.egUNIGENE2EG"? Or should I retrieve the human pathways in reactome.db and then count the unique annotated genes? Any comments?
>
> I know there is a Reactome Statistics webpage for some species at the Reactome official website, but reactome.db is only updated twice each year, not every day, so I guess those numbers are not accurate for reactome.db.
>
> Thanks
>
> Gilbert


