[BioC] from using biomaRt and r10kcod

Weiwei Shi helprhelp at gmail.com
Mon May 14 22:29:49 CEST 2007


Hi, there:

I would like to revisit the question of mapping CodeLink probe IDs to
human Entrez Gene IDs. Let me describe my question with an example:

Using the r10kcod package, you can find that probe "GE16490" is mapped
to "502674", which I assume is a rat Entrez Gene ID. However, when I use
biomaRt to convert all rat Entrez Gene IDs on this array to human ones,
I get the following mappings involving 502674:

         id MappedID rat.count human.count
4167 296197    11034         1           2
7021 502674    11034         1           2

So, basically, 296197, 502674 and 11034 are all associated with the
protein "destrin". To be precise, 296197 is a rat protein that is
similar to destrin.

However, as shown in
http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?CMD=search&DB=gene
the other two (11034 and *502674*) are human IDs (please correct me if
I am wrong here).

So my questions are:

1. Is 502674 a rat Entrez Gene ID or a human one?
2. Is r10kcod wrong, is NCBI wrong, or is my understanding wrong? (I
assume the last one :)
3. I found many many-to-many mappings when converting rat Entrez Gene
IDs to human ones, such as the following:
> t0[t0[,1]== 396527,]
         id MappedID rat.count human.count
6608 396527    54576         9           4
6609 396527    54575         9           4
6610 396527    54600         9           4
6611 396527    54577         9           4
6612 396527    54578         9           4
6613 396527    54579         9           4
6614 396527    54657         9           4
6615 396527    54659         9           4
6616 396527    54658         9           4
> t0[t0[,2]== 54576,]
         id MappedID rat.count human.count
2494 113992    54576         9           4
6608 396527    54576         9           4
6617 396551    54576         9           4
6626 396552    54576         9           4
> t0[t0[,2]== 54577,]
         id MappedID rat.count human.count
2497 113992    54577         9           4
6611 396527    54577         9           4
6620 396551    54577         9           4
6629 396552    54577         9           4

So, basically, all of these IDs correspond to different polypeptides
of the UDP glucuronosyltransferase 1 family. Are there other
situations that cause such many-to-many mappings?
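
To make the counts above easier to reproduce, here is a minimal sketch in
base R of how each pair could be tallied and labelled. It rests on some
assumptions: that rat.count is the number of distinct human IDs per rat ID
and human.count the number of distinct rat IDs per human ID (this matches
the rows quoted above), and that a toy table holding only those quoted
rows is enough for illustration. The "type" column and its labels are
hypothetical names, not something biomaRt returns.

```r
# Toy mapping table holding only the rows quoted in this message
# (rat Entrez Gene ID -> human Entrez Gene ID); the real t0 is larger.
t0 <- data.frame(
  id       = c(296197, 502674, rep(396527, 9),
               rep(c(113992, 396551, 396552), 2)),
  MappedID = c(11034, 11034,
               54576, 54575, 54600, 54577, 54578, 54579, 54657, 54659, 54658,
               rep(c(54576, 54577), each = 3))
)

# Assumed meaning of the two count columns:
# rat.count   = distinct human IDs that a given rat ID maps to
# human.count = distinct rat IDs that map to a given human ID
n_distinct <- function(x) length(unique(x))
t0$rat.count   <- ave(t0$MappedID, t0$id, FUN = n_distinct)
t0$human.count <- ave(t0$id, t0$MappedID, FUN = n_distinct)

# Hypothetical label for each pair, written as rat:human
t0$type <- ifelse(t0$rat.count == 1 & t0$human.count == 1, "1:1",
           ifelse(t0$rat.count == 1, "many:1",
           ifelse(t0$human.count == 1, "1:many", "many:many")))

# Overview of how many pairs fall into each class
print(table(t0$type))
```

Because the toy table contains only the quoted rows, some counts come out
smaller than in the full t0 (113992, for example, appears with just two
human IDs here).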

Sorry for the long questions,

Regards,

-- 
Weiwei Shi, Ph.D
Research Scientist
GeneGO, Inc.

"Did you always know?"
"No, I did not. But I believed..."
---Matrix III


