[BioC] hs.Mm.inp.db problem

Marc Carlson mcarlson at fhcrc.org
Thu Nov 12 21:29:11 CET 2009


Hi Iain,

The trouble is that inparanoid represents mouse proteins with Jackson
Laboratory (MGI) IDs rather than Ensembl protein IDs, so lookups keyed
on Ensembl protein IDs come back as NA.

So this script should work better:

library(hom.Mm.inp.db)
library(org.Mm.eg.db)
library(org.Hs.eg.db)

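## mouse gene symbols to look up
dataIn <- c('Ints7', 'Upp1', 'Cdc2a')

## map symbols to mouse Entrez Gene IDs
egs <- mget(dataIn, revmap(org.Mm.egSYMBOL))

## this is what you want right here:
## map Entrez Gene IDs to MGI IDs, the mouse IDs inparanoid uses
mouseProtIds <- mget(unlist(egs), org.Mm.egMGI)
mouseProtIds <- mouseProtIds[!is.na(mouseProtIds)]

## look up the human inparanoid matches for those MGI IDs
rawHumanProtIds <- mget(unlist(mouseProtIds), hom.Mm.inpHOMSA, ifnotfound=NA)

##etc.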
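
To illustrate why the original attempt returned only NA, here is a minimal base-R sketch. It uses an ordinary environment as a stand-in for the inparanoid map. The name toyMap and every ID in it are invented for illustration and are not taken from hom.Mm.inp.db. The real map is an AnnotationDbi object, so treat this only as an analogy for the lookup behaviour.

```r
## Toy stand-in for hom.Mm.inpHOMSA (an assumption for illustration only):
## keys are MGI-style IDs, values are human protein IDs. All IDs are made up.
toyMap <- new.env()
assign("MGI:0000001", "ENSP_TOY_1", envir = toyMap)
assign("MGI:0000002", "ENSP_TOY_2", envir = toyMap)

## Keys of the wrong ID type are not found, so ifnotfound=NA fills in NA
wrongKeys <- c("ENSMUSP_TOY_1", "ENSMUSP_TOY_2")
missed <- mget(wrongKeys, envir = toyMap, ifnotfound = NA)

## Keys of the type the map actually uses are found
rightKeys <- c("MGI:0000001", "MGI:0000002")
hits <- mget(rightKeys, envir = toyMap, ifnotfound = NA)

print(all(is.na(unlist(missed))))
print(unlist(hits))
```

With the real package, looking at a few keys, for example with head(mappedkeys(hom.Mm.inpHOMSA)), should show which ID type the map expects before you run the full lookup.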

Hope this helps,


  Marc



Iain Gallagher wrote:
> Hi - Just a follow up post.
>
> The title should of course be hom.Mm.inp.db problem and session info is below:
>
> > sessionInfo()
> R version 2.9.0 (2009-04-17) 
> x86_64-pc-linux-gnu 
>
> locale:
> LC_CTYPE=en_GB.UTF-8;LC_NUMERIC=C;LC_TIME=en_GB.UTF-8;LC_COLLATE=en_GB.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_GB.UTF-8;LC_PAPER=en_GB.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_GB.UTF-8;LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base     
>
> other attached packages:
> [1] org.Hs.eg.db_2.2.11  org.Mm.eg.db_2.2.11  hom.Mm.inp.db_2.2.11
> [4] RSQLite_0.7-1        DBI_0.2-4            AnnotationDbi_1.6.0 
> [7] Biobase_2.4.1       
>   
> Thanks
>
> Iain
>
> --- On Thu, 12/11/09, Iain Gallagher <iaingallagher at btopenworld.com> wrote:
>
>   
>> From: Iain Gallagher <iaingallagher at btopenworld.com>
>> Subject: [BioC] hs.Mm.inp.db problem
>> To: bioconductor at stat.math.ethz.ch
>> Date: Thursday, 12 November, 2009, 18:41
>> Hello List
>>
>> I am trying to map ~5000 mouse genes to human genes using
>> the inparanoid package and I am failing miserably!
>>
>> Having followed the example in the documentation I can't
>> get any of my 5000 mouse genes converted to human EG ids.
>>
>> Example follows with 3 genes only:
>>
>> rm(list=ls())
>>
>> library(hom.Mm.inp.db)
>> library(org.Mm.eg.db)
>> library(org.Hs.eg.db)
>>
>> #mouse genes in as symbols
>> dataIn <- c('Ints7', 'Upp1', 'Cdc2a')
>>
>> #map these to mouse EG ids
>> egIds <- revmap(org.Mm.egSYMBOL)
>> mapped <- mappedkeys(egIds)
>> egIds <- as.list(egIds[mapped])
>> ind <- which(names(egIds)%in%dataIn)
>> egIdsIn <- egIds[ind]
>> #map these IDs to ENSEMBL protein Ids as used for the inparanoid mapping
>> mouseProtIds <- mget(unlist(egIdsIn),org.Mm.egENSEMBLPROT)
>> mouseProtIds <- mouseProtIds[!is.na(mouseProtIds)]
>>
>> #this is the point of failure!
>> rawHumanProtIds <- mget(unlist(mouseProtIds),hom.Mm.inpHOMSA,ifnotfound=NA)
>>
>>
>> The returned list is full of NAs.
>>
>> Using biomart on the Ensembl site I can get:
>>
>> Ensembl Transcript ID    Human Ensembl Protein ID
>> ENSMUST00000020099       ENSP00000397973
>>
>> For example, for Cdc2a, so I know there are homologs there,
>> but for some reason the inparanoid package is not working
>> for me.
>> Using the example in the documentation it does work though
>> so I'm assuming the mistake is with me.
>>
>> Can anyone help with this (more curiosity now - I can get
>> the data through biomart)?
>>
>> Cheers
>>
>> Iain
>>
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor at stat.math.ethz.ch
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
>>
>>     
>


