[BioC] GSEABase how to map gene symbols to mouse EntrezId or Affy

Vladimir Morozov vmorozov at als.net
Thu May 15 19:56:26 CEST 2008


Martin,

You are right that the disagreement between human and mouse symbols is the
problem. But you should still get some mapping if you convert the symbols
to capitalized form with capwords:
> sum(!is.na(mget(gss[[1]]@geneIds,org.Mm.egSYMBOL2EG,ifnotfound=NA)))
[1] 0
> sum(!is.na(mget(capwords(tolower(gss[[1]]@geneIds)),org.Mm.egSYMBOL2EG,ifnotfound=NA)))
[1] 46
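
Note that capwords is not a base R function; the calls above assume a helper along the lines of the one shown in the examples of ?toupper. A minimal base-R sketch of such a helper follows. The example symbols are illustrative and are not taken from gss, and case conversion is only a heuristic: for instance, the mouse p53 gene is named Trp53, so a case-converted "Tp53" would not be expected to match.

```r
# Hypothetical helper: upper-case the first character of each symbol and
# lower-case the rest (human-style "BRCA1" becomes mouse-style "Brca1").
capwords <- function(s) {
  paste0(toupper(substring(s, 1, 1)), tolower(substring(s, 2)))
}

# Illustrative human-style symbols (assumed for this example only).
human_style <- c("TP53", "BRCA1", "GAPDH")
mouse_style <- capwords(human_style)
print(mouse_style)
```

Symbols recovered this way still need checking against a real ortholog table before being treated as equivalent genes.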
Let's say I work out a mapping using ortholog or alias names.
Will I break the GeneSet data structure by doing
gss2 <- lapply(gss,function(x){x@geneIds <-
my.mapping(x@geneIds);x@geneIdType@type <- 'EntrezIdentifier'})
?
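
One thing to watch in a function written like the one above: its last expression is an assignment, so lapply would collect the assigned value rather than the modified object. The sketch below shows this with a toy S4 class. ToySet and its slots are stand-ins invented for illustration, not the GSEABase GeneSet class, and tolower stands in for the hypothetical my.mapping.

```r
library(methods)

# Toy stand-in for a gene set; NOT the GSEABase class.
setClass("ToySet", representation(geneIds = "character", idType = "character"))
sets <- list(new("ToySet", geneIds = c("TP53", "BRCA1"), idType = "Symbol"))

# Same shape as the function in the question: the final expression is a
# slot assignment, so each element of the result is the assigned string.
bad <- lapply(sets, function(x) {
  x@geneIds <- tolower(x@geneIds)
  x@idType <- "EntrezIdentifier"
})

# Returning x explicitly keeps the modified object.
good <- lapply(sets, function(x) {
  x@geneIds <- tolower(x@geneIds)
  x@idType <- "EntrezIdentifier"
  x
})

print(class(bad[[1]]))
print(is(good[[1]], "ToySet"))
```

Direct slot assignment with @ also skips any validity method a class defines, so calling validObject on each result afterwards is a reasonable precaution.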



Vladimir Morozov 



-----Original Message-----
From: Martin Morgan [mailto:mtmorgan at fhcrc.org] 
Sent: Thursday, May 15, 2008 12:56 PM
To: Vladimir Morozov
Cc: bioconductor at stat.math.ethz.ch
Subject: Re: [BioC] GSEABase how to map gene symbols to mouse EntrezId
or Affy

Hi Vladimir --

"Vladimir Morozov" <vmorozov at als.net> writes:

> Hi
>  
> Any suggestions on how to map gene symbols to mouse EntrezId (preferred)
> or Affy?
> Mapping to Entrez apparently is not supported by GSEABase
>> mapIdentifiers(gss,EntrezIdentifier())
> Error in .mapIdentifiers_isMappable(from, to) : 
>   unable to map from 'Symbol' to 'EntrezId'
>     neither GeneIdentifierType has annotation

mapIdentifiers needs to know where to look for the map. I guess the way
you created gss means that it doesn't know about the organism you're
using, and EntrezIdentifier() also doesn't. What you want is

> mapIdentifiers(gss, EntrezIdentifier("org.Mm.eg.db"))
GeneSetCollection
  names: chr5q23, chr16q24 (2 total)
  unique identifiers:  (0 total)
  types in collection:
    geneIdType: EntrezIdentifier (1 total)
    collectionType: BroadCollection (1 total)

Here I'm using (and I guess you are too) the gss that comes from
example(getBroadSets). These are human genes, and have no corresponding
mouse equivalents (see below)...

> Error in GeneSetCollection(lapply(what, mapIdentifiers, to, ..., 
> verbose = verbose)) :
>   error in evaluating the argument 'object' in selecting a method for 
> function 'GeneSetCollection'
>  
>  
> Mapping to Affys works for human, but not for mouse
>> mapIdentifiers(gss, AnnotationIdentifier("hgu95av2.db"))
> GeneSetCollection
>   names: chr5q23, chr16q24 (2 total)
>   unique identifiers: 35089_at, 35090_g_at, ..., 35807_at (79 total)
>   types in collection:
>     geneIdType: AnnotationIdentifier (1 total)
>     collectionType: BroadCollection (1 total)
>> mapIdentifiers(gss, AnnotationIdentifier("mouse4302.db"))
> GeneSetCollection
>   names: chr5q23, chr16q24 (2 total)
>   unique identifiers:  (0 total)
>   types in collection:
>     geneIdType: AnnotationIdentifier (1 total)
>     collectionType: BroadCollection (1 total)

This is because the identifiers are not mouse identifiers:

> ids <- unique(unlist(geneIds(gss)))
> egs <- mget(ids, revmap(mouse4302ENTREZID), ifnotfound=NA) 
> sum(!sapply(egs, is.na))
[1] 0

>> 
>  
>  
> Thanks
>  
>
> Vladimir Morozov
>
>

--
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M2 B169
Phone: (206) 667-2793


