[BioC] A question about org.Hs.egMAPCOUNTS

Hervé Pagès hpages at fhcrc.org
Wed Mar 3 02:00:17 CET 2010


Hi Gilbert,

It's not a good idea to start a new thread by picking up a random
post and pressing the "Reply" button. Then your post will show up
in the middle of an existing thread and people will most likely
ignore it.

Gilbert Feng wrote:
> Hello, BioC folks,
> 
> I notice that org.Hs.egMAPCOUNTS reports that org.Hs.egGO2EG is 8245.

As explained in the man page for org.Hs.egMAPCOUNTS (see
?org.Hs.egMAPCOUNTS), this is just the number of "keys" that
are mapped. Since the org.Hs.egGO2EG map is actually the reverse
of the org.Hs.egGO map, its keys are on the right side of the
Gene-to-GO mapping. In other words, the keys are GO ids, not genes.
So this means that the Human genes in org.Hs.eg.db are mapped
to 8245 distinct GO terms:

   > count.mappedRkeys(org.Hs.egGO2EG)
   [1] 8245
   > count.mappedRkeys(org.Hs.egGO)
   [1] 8245

Note that those 2 maps only hold the GO terms that are linked to
at least 1 gene.
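
To make the difference between counting keys on the gene side and on
the GO side concrete, here is a small base-R sketch. It does not use
org.Hs.eg.db at all: the gene ids and GO ids are invented placeholders,
and the variable names (gene2go, go2gene, ...) are only for illustration.

```r
# Toy gene -> GO mapping; all ids are made up for illustration only.
gene2go <- list(
  g1 = c("GO:A", "GO:B"),
  g2 = "GO:B",
  g3 = character(0),      # a gene with no GO annotation
  g4 = c("GO:B", "GO:C"),
  g5 = "GO:A"
)

# Flatten to (gene, GO) pairs, then reverse the mapping: GO -> genes.
genes <- rep(names(gene2go), lengths(gene2go))
gos   <- unlist(gene2go, use.names = FALSE)
go2gene <- split(genes, gos)

# Analogue of counting mapped keys on the forward map:
# genes linked to at least 1 GO term.
n_mapped_genes <- sum(lengths(gene2go) > 0)

# Analogue of counting mapped keys on the reverse map:
# GO terms linked to at least 1 gene.
n_mapped_go <- length(go2gene)
```

The two counts answer different questions, which is why the number
reported for a reverse map should be read as a number of GO terms
and not as a number of genes.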

> Are
> these 8245 genes unique or do all of the GO terms contain 8245 human genes
> (which could be counted many times)?

Neither: 8245 is a number of GO terms, not a number of genes.

> Actually, I wonder how many unique human
> genes are in GO and its subdirectories, BP, MF and CC.

Number of Human genes in org.Hs.eg.db that are mapped to at least 1
GO term:

   > count.mappedkeys(org.Hs.egGO)
   [1] 17673

For the BP, MF and CC ontologies, if you are familiar with GO it should
be easy for you to find the GO id of the top-level node of each
ontology: GO:0008150 for BP, GO:0003674 for MF and GO:0005575 for CC.
Then you can count the number of Human genes that are mapped to at
least 1 GO term in the BP ontology with:

   > count.mappedLkeys(org.Hs.egGO2ALLEGS["GO:0008150"])
   [1] 14221
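
The same call can be repeated for MF and CC. Here is a minimal sketch
that loops over the 3 top-level GO ids given above. The names "roots"
and "genes_per_ontology" are only for illustration, and the counts you
get will depend on the version of org.Hs.eg.db you have installed, so
no output is shown.

```r
library(org.Hs.eg.db)  # also loads AnnotationDbi

# Top-level GO ids of the 3 ontologies (taken from the text above).
roots <- c(BP = "GO:0008150", MF = "GO:0003674", CC = "GO:0005575")

# For each ontology, count the genes that are mapped to the top-level
# term or to any of its descendants (this is what the GO2ALLEGS map holds).
genes_per_ontology <- sapply(
  roots,
  function(id) count.mappedLkeys(org.Hs.egGO2ALLEGS[id])
)

genes_per_ontology
```

None of the 3 counts should be greater than the value returned by
count.mappedkeys(org.Hs.egGO), since that one counts the genes that
are mapped to at least 1 GO term in any ontology.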

Cheers,
H.

> Is there any function to
> retrieve such information easily or I have to write several lines to do
> that?
> 
> Thanks a lot!
> 
> Gilbert

-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M2-B876
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages at fhcrc.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319


