[BioC] from using biomaRt and r10kcod

James W. MacDonald jmacdon at med.umich.edu
Mon May 14 22:51:49 CEST 2007


Weiwei Shi wrote:
> Hi, there:
> 
> I am revisiting the question of mapping codelink probe ids to human
> entrezgene ids. I will describe my question using an example:
> 
> Using the r10kcod package, you can find probe "GE16490" mapped to
> "502674", which I assume is a rat entrezgene id. However, when I use
> biomaRt to convert all rat entrezgene ids on this array to human ones,
> I find the following maps involving 502674:
> 
>          id MappedID rat.count human.count
> 4167 296197    11034         1           2
> 7021 502674    11034         1           2
> 
> so, basically, 296197, 502674 and 11034 are all associated with
> protein "destrin". To be accurate, 296197 is a rat protein which is
> similar to destrin.
> 
> However, as shown in
> http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?CMD=search&DB=gene
> , the other two (11034 and *502674*) are human ids (if I am wrong
> here, please correct me).
> 
> so my questions are:
> 
> 1. Is 502674 a rat entrezgene id or a human one?

When I search on that term I get a rat gene. In fact, a quick sample of 
the IDs below indicates that the rat IDs are all rat, and the human IDs 
are all human.


> 2. Is r10kcod wrong, is NCBI wrong, or is my understanding wrong? (I
> assume the last one :)

I think you might have become confused if you did a bunch of queries, 
and thought that 502674 came up as Homo sapiens instead of Rattus 
norvegicus on NCBI.


> 3. I found a lot of many-to-many maps in this process of mapping rat
> to human entrezgene ids, like the following:
> 
>>t0[t0[,1]== 396527,]
> 
>          id MappedID rat.count human.count
> 6608 396527    54576         9           4
> 6609 396527    54575         9           4
> 6610 396527    54600         9           4
> 6611 396527    54577         9           4
> 6612 396527    54578         9           4
> 6613 396527    54579         9           4
> 6614 396527    54657         9           4
> 6615 396527    54659         9           4
> 6616 396527    54658         9           4
> 
>>t0[t0[,2]== 54576,]
> 
>          id MappedID rat.count human.count
> 2494 113992    54576         9           4
> 6608 396527    54576         9           4
> 6617 396551    54576         9           4
> 6626 396552    54576         9           4
> 
>>t0[t0[,2]== 54577,]
> 
>          id MappedID rat.count human.count
> 2497 113992    54577         9           4
> 6611 396527    54577         9           4
> 6620 396551    54577         9           4
> 6629 396552    54577         9           4
> 
> So, basically, all the ids are related to different polypeptides
> associated with the UDP glucuronosyltransferase 1 family. Are there
> some other situations causing these many-to-many mappings?

Not sure I understand the question. Are you asking if there are 
duplicate Entrez Gene Ids that map to the same or very similar genes? In 
my experience, yes. In addition, when you are looking at homology 
mappings it isn't uncommon for a gene in one species to map to several 
closely related genes in another (since they are mapped by homology, and 
the closely related genes are often nearly identical in sequence).
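
If it helps to see which pairs are really many-to-many, here is a rough
base R sketch. It uses only the rows quoted above (so the counts are for
this excerpt, not your full table), and the names pairs, n_human, n_rat
and many2many are made up for illustration; they are not part of
biomaRt or r10kcod.

```r
# Rat -> human Entrez Gene pairs copied from the rows quoted in this
# thread. This is a partial excerpt, not the complete mapping.
pairs <- data.frame(
  rat   = c(rep(396527, 9),
            113992, 396551, 396552,
            113992, 396551, 396552),
  human = c(54576, 54575, 54600, 54577, 54578, 54579, 54657, 54659, 54658,
            54576, 54576, 54576,
            54577, 54577, 54577)
)

# For every row: how many distinct human genes its rat gene maps to,
# and how many distinct rat genes its human gene maps to.
count_unique <- function(x) length(unique(x))
pairs$n_human <- ave(pairs$human, pairs$rat,   FUN = count_unique)
pairs$n_rat   <- ave(pairs$rat,   pairs$human, FUN = count_unique)

# A row is many-to-many only if both sides have more than one partner.
pairs$many2many <- pairs$n_human > 1 & pairs$n_rat > 1

print(pairs[pairs$many2many, ])
```

Rows where only one of the two counts is above 1 are one-to-many (or
many-to-one) rather than many-to-many, which is what you would expect
for a family of closely related genes.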

Best,

Jim


> 
> Sorry for the long questions,
> 
> Regards,
> 


-- 
James W. MacDonald, M.S.
Biostatistician
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
1500 E. Medical Center Drive
7410 CCGC
Ann Arbor MI 48109
734-647-5623

