[BioC] Help me understand org.Hs.eg.db

Christof Winter winter at biotec.tu-dresden.de
Sat Apr 4 13:56:56 CEST 2009


Daren Tan wrote, On 04.04.2009 06:06:
> I am using two approaches to get EntrezID to genes mapping, as well as
> genes to EntrezID mappings. toTable gives same number of mappings in
> both directions, but mget doesn't. Which approach should I trust and
> why ?
> 
>> dim(toTable(org.Hs.egSYMBOL2EG))
> [1] 39824     2
>> dim(toTable(org.Hs.egSYMBOL))
> [1] 39824     2
> 
>> length(mget(mappedRkeys(org.Hs.egSYMBOL2EG), org.Hs.egSYMBOL2EG))
> [1] 39800
>> length(mget(mappedLkeys(org.Hs.egSYMBOL), org.Hs.egSYMBOL))
> [1] 39824

Dear Daren:

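Both results are correct; they count different things. toTable returns
one row per Entrez Gene ID/symbol pair, whereas mget returns one list
element per key, so its length is the number of distinct keys. Some
gene symbols have more than one Entrez Gene ID mapped to them: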

 > x = mget(mappedRkeys(org.Hs.egSYMBOL2EG), org.Hs.egSYMBOL2EG)
 > sum(listLen(x) > 1)
[1] 24
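
To make the counting difference concrete, here is a minimal sketch in
base R that needs no annotation package. The IDs and symbols are made up
for illustration, and split() only stands in for the two map directions;
it is not how org.Hs.eg.db works internally.

```r
# Toy stand-in for toTable(org.Hs.egSYMBOL): one row per (gene_id, symbol) pair.
# The IDs and symbols below are made up for illustration.
pairs <- data.frame(
  gene_id = c("1", "2", "3", "4"),
  symbol  = c("AAA", "BBB", "CCC", "CCC"),  # "CCC" is shared by two IDs
  stringsAsFactors = FALSE
)

# Forward direction (like org.Hs.egSYMBOL): one list element per Entrez Gene ID.
id2symbol <- split(pairs$symbol, pairs$gene_id)

# Reverse direction (like org.Hs.egSYMBOL2EG): one list element per symbol.
symbol2id <- split(pairs$gene_id, pairs$symbol)

n_pairs   <- nrow(pairs)                  # counts pairs, like dim(toTable(...))[1]
n_ids     <- length(id2symbol)            # counts distinct IDs
n_symbols <- length(symbol2id)            # counts distinct symbols
n_multi   <- sum(lengths(symbol2id) > 1)  # symbols with more than one ID
```

Here the toy data has 4 pairs but only 3 distinct symbols, a gap of 1,
which equals the number of shared symbols. In your output the gap is
39824 - 39800 = 24, and since exactly 24 symbols have more than one ID,
each of them must carry exactly two.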

If the exact count matters to you, you could look up those Entrez Gene
IDs at NCBI and decide case by case how to count them:

 > x[listLen(x) > 1]
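
If a flat table is easier to review than a list, something like the
following sketch may help. ambiguous_symbols is a hypothetical helper,
not part of AnnotationDbi, and it assumes the input has columns named
gene_id and symbol, which is what I would expect from
toTable(org.Hs.egSYMBOL). The demo data is made up.

```r
# Hypothetical helper (not part of AnnotationDbi): keep only rows whose symbol
# occurs more than once. Assumes `tab` has columns `gene_id` and `symbol`.
ambiguous_symbols <- function(tab) {
  counts <- table(tab$symbol)
  shared <- names(counts)[counts > 1]
  out <- tab[tab$symbol %in% shared, , drop = FALSE]
  # Sort so that IDs sharing a symbol end up next to each other.
  out <- out[order(out$symbol, out$gene_id), , drop = FALSE]
  rownames(out) <- NULL
  out
}

# Made-up demo data; with the real package one would pass
# toTable(org.Hs.egSYMBOL) instead.
demo <- data.frame(
  gene_id = c("10", "20", "30", "40"),
  symbol  = c("XXX", "YYY", "XXX", "ZZZ"),
  stringsAsFactors = FALSE
)
res <- ambiguous_symbols(demo)
```

For the demo data, res holds the two rows for the shared symbol, side by
side, which makes the manual check against NCBI a bit quicker.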

HTH,
Christof

-- 
Christof Winter
Bioinformatics Group
Biotechnologisches Zentrum
Technische Universität Dresden
Tatzberg 47-51
01307 Dresden
Germany


