[BioC] A question about org.Hs.egMAPCOUNTS

Hervé Pagès hpages at fhcrc.org
Mon Mar 8 19:57:23 CET 2010


Hi Gilbert,

Gilbert Feng wrote:
> Hi, Herve
> 
> Thank you very much for your prompt reply. Yes, I forgot that parent nodes
> in GO include all genes contained in their child nodes.

It depends on which map you are talking about. This is true for
org.Hs.egGO2ALLEGS but not for org.Hs.egGO2EG. In the latter, each GO id
is mapped only to the genes that are linked *directly* to it, not to the
genes linked to it or to any of its offspring. This is *the* difference
between the 2 maps.
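
To illustrate the difference, here is a minimal base-R sketch on a made-up
hierarchy. The term ids (T:root, T:a, ...), the gene names and the helper
functions are all hypothetical; none of them come from org.Hs.eg.db or
AnnotationDbi. The sketch only mimics how a "direct" map (in the spirit of
org.Hs.egGO2EG) relates to an "all" map (in the spirit of org.Hs.egGO2ALLEGS).

```r
# Toy hierarchy (hypothetical ids, not real GO terms):
# each term lists its direct children.
children <- list(
  "T:root" = c("T:a", "T:b"),
  "T:a"    = c("T:a1"),
  "T:b"    = character(0),
  "T:a1"   = character(0)
)

# Direct annotations. "T:root" has no directly linked gene, so it is
# simply absent from this map.
direct <- list(
  "T:a"  = c("g1"),
  "T:b"  = c("g2", "g3"),
  "T:a1" = c("g1", "g4")
)

# A term together with all of its offspring (children, grandchildren, ...).
term_and_offspring <- function(term) {
  unique(c(term, unlist(lapply(children[[term]], term_and_offspring))))
}

# Genes linked to a term or to any of its offspring.
all_genes <- function(term) {
  sort(unique(unlist(direct[term_and_offspring(term)], use.names = FALSE)))
}

all_map <- lapply(setNames(names(children), names(children)), all_genes)

length(direct[["T:root"]])   # 0: no gene is linked directly to the root
all_map[["T:root"]]          # "g1" "g2" "g3" "g4"
all_map[["T:a"]]             # "g1" "g4"
```

In this toy data the root term has no entry in the direct map but collects
every gene in the "all" map, which is why counting genes under a top-level
node only makes sense with the "all" kind of map.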

> 
> Oh! I did remove and change the subject and text of that email and kept
> bioconductor at stat.math.ethz.ch as the only recipient. I assumed that would
> be exactly the same as sending a post to the bioc mailing list directly, since
> it looks the same on my side.

It looks the same but it's not. There is a lot of stuff in the header of
an email that you don't see and don't control. I'm not sure, but some of
it (for example the In-Reply-To and References fields) is probably used
by Mailman (the mailing list software) or by other people's email clients
in order to display threads.

> It's good to know that they are different. Next time,
> I'll send my post directly.
> 
> Anyway, I appreciate your kind help, and have a good day!

You're welcome.

Thanks,
H.

> 
> Gilbert
> 
> 
> On 3/2/10 7:00 PM, "Hervé Pagès" <hpages at fhcrc.org> wrote:
> 
>> Hi Gilbert,
>>
>> It's not a good idea to start a new thread by picking up a random
>> post and pressing the "Reply" button. Then your post will show up
>> in the middle of an existing thread and people will most likely
>> ignore it.
>>
>> Gilbert Feng wrote:
>>> Hello, BioC folks,
>>>
>>> I notice that org.Hs.egMAPCOUNTS reports 8245 for org.Hs.egGO2EG.
>> As explained in the man page for org.Hs.egMAPCOUNTS (see
>> ?org.Hs.egMAPCOUNTS), this is just the number of "keys" that
>> are mapped. In the case of the org.Hs.egGO2EG map, since this
>> map is actually the reverse of the org.Hs.egGO map, that means
>> that the keys are on the right side of the Gene-to-GO mapping,
>> or, said otherwise, that the keys are GO ids, not genes.
>> So this means that the Human genes in org.Hs.eg.db are mapped
>> to 8245 distinct GO terms:
>>
>>    > count.mappedRkeys(org.Hs.egGO2EG)
>>    [1] 8245
>>    > count.mappedRkeys(org.Hs.egGO)
>>    [1] 8245
>>
>> Note that those 2 maps only hold the GO terms that are linked to
>> at least 1 gene.
>>
>>> Are these 8245 genes unique, or do all the GO terms together contain
>>> 8245 human genes (possibly counted many times)?
>> Neither: 8245 is a count of GO terms, not of genes.
>>
>>> Actually, I wonder how many unique human genes there are in GO
>>> and its sub-ontologies, BP, MF and CC.
>> Number of Human genes in org.Hs.eg.db that are mapped to at least 1
>> GO term:
>>
>>    > count.mappedkeys(org.Hs.egGO)
>>    [1] 17673
>>
>> For the BP, MF and CC ontologies, if you are familiar with GO it should
>> be easy for you to find the GO id of the top-level node of each
>> ontology: GO:0008150 for BP, GO:0003674 for MF and GO:0005575 for CC.
>> Then you can count the number of Human genes that are mapped to at
>> least 1 GO term in the BP ontology with:
>>
>>    > count.mappedLkeys(org.Hs.egGO2ALLEGS["GO:0008150"])
>>    [1] 14221
>>
>> Cheers,
>> H.
>>
>>> Is there any function to
>>> retrieve such information easily, or do I have to write several lines
>>> to do that?
>>>
>>> Thanks a lot!
>>>
>>> Gilbert
>>>
>>> _______________________________________________
>>> Bioconductor mailing list
>>> Bioconductor at stat.math.ethz.ch
>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>> Search the archives:
>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
> 
> 
> 
> 

-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M2-B876
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages at fhcrc.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319


