[BioC] Annotate - gene name to ENSEMBL

Marc Carlson mcarlson at fhcrc.org
Tue Nov 5 19:28:48 CET 2013


Or you can use an annotation package (if you know what organism you are 
searching).

So for example:

library(Mus.musculus)

## Now, if you have a gene name like this:

name <- "pregnancy zone protein"

## You can try to extract it directly (and hope it's an exact match),
## like this:

select(Mus.musculus, keys=name, columns="ENSEMBL", keytype="GENENAME")

## That will work as long as the name exactly matches what is in the
## database. But names pose a special problem, since they can sometimes
## be written in slightly different ways.
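
To see why exact matching on names is fragile, here is a minimal base R sketch that needs no annotation package. The vector gene_names is a made-up stand-in for the GENENAME keys (only "pregnancy zone protein" comes from this thread; the other entries are illustrative and may not match the database spelling), and query is a hypothetical user-supplied name with different capitalization.

```r
## Toy stand-in for the GENENAME keys of an annotation package
## (illustrative values only, not pulled from Mus.musculus).
gene_names <- c("pregnancy zone protein",
                "pregnancy-specific glycoprotein 16",
                "alpha-2-macroglobulin")

## A name written slightly differently from the stored one.
query <- "Pregnancy Zone Protein"

## Exact matching finds nothing because the capitalization differs.
exact <- gene_names[gene_names == query]

## Pattern matching on a distinctive word returns every candidate.
partial <- grep("pregnancy", gene_names, value = TRUE)

## Case-insensitive matching recovers the intended entry.
insensitive <- grep(query, gene_names, ignore.case = TRUE, value = TRUE)
```

Here exact is empty, while partial holds the two "pregnancy" names and insensitive holds the single intended name.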


## So instead, you might want to use the keys method to do partial
## matching first, as described in this man page:
help('keys,OrganismDb-method')

## That would mean that you could look up a range of "valid" keys like this:
possible <- keys(Mus.musculus, keytype="GENENAME", pattern="pregnancy")

## And then you could choose the key you want and use it to extract
## whatever you want to know.
select(Mus.musculus, keys=possible[1], columns="ENSEMBL", 
keytype="GENENAME")



## OR maybe you are asking a more general question and you just want to
## know which ENSEMBL IDs are matched to any GENENAME that has "pregnancy"
## in the name. For that you could just call keys and use the column
## argument like this:
keys(Mus.musculus, keytype="ENSEMBL", pattern="pregnancy", 
column="GENENAME")


## OR you might want to combine a more usual use of keys with select to
## get both kinds of information about any gene that has "pregnancy" in
## the name:
select(Mus.musculus, keys(Mus.musculus, keytype="GENENAME", 
pattern="pregnancy"), columns="ENSEMBL", keytype="GENENAME")
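
If you do this lookup often, the keys-then-select pattern above could be wrapped in a small helper. This is only a sketch: lookup_ensembl is a hypothetical name, not part of any package, and the guard for an empty result assumes that select() may complain when it is handed no keys.

```r
## Hypothetical helper: partial-match gene names, then map them to ENSEMBL.
## 'db' is an annotation object such as Mus.musculus (the package must
## already be loaded); 'pattern' is the text to search for in GENENAME.
lookup_ensembl <- function(db, pattern) {
  stopifnot(is.character(pattern), length(pattern) == 1L, nzchar(pattern))

  ## Step 1: find every gene name that contains the pattern.
  hits <- AnnotationDbi::keys(db, keytype = "GENENAME", pattern = pattern)

  ## Step 2: return an empty table instead of calling select() with no keys.
  if (length(hits) == 0L) {
    return(data.frame(GENENAME = character(0), ENSEMBL = character(0)))
  }

  ## Step 3: map the matching names to ENSEMBL gene IDs.
  AnnotationDbi::select(db, keys = hits, columns = "ENSEMBL",
                        keytype = "GENENAME")
}

## Usage (requires the Mus.musculus package):
## library(Mus.musculus)
## lookup_ensembl(Mus.musculus, "pregnancy")
```

Passing the annotation object as an argument means the same helper should work for other organism packages that support the GENENAME and ENSEMBL keytypes.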


Hope this helps,


   Marc





On 11/05/2013 12:21 AM, Hans-Rudolf Hotz wrote:
> Hi Kripa
>
> Use biomaRt
> see: http://www.bioconductor.org/packages/release/bioc/html/biomaRt.html
>
>
> quick example, assuming you are working with mouse, and want ensembl 
> gene ids:
>
> > library(biomaRt)
> > ensembl = useMart("ensembl")
> > mouse.ensembl = useDataset("mmusculus_gene_ensembl",mart=ensembl)
> >
> > getBM(attributes = "ensembl_gene_id", filters = 'mgi_symbol', 
> values=c("Papola"),mart=mouse.ensembl)
>      ensembl_gene_id
> 1 ENSMUSG00000021111
> >
>
>
> Regards, Hans-Rudolf
>
>
>
>
> On 11/05/2013 02:07 AM, Kripa R wrote:
>> Hi everyone,
>>
>> Does anyone know how to go from gene name to ENSEMBL ID?
>>
>> I'm using lumi to analyze my microarray data, however the names get 
>> changed from NuID to gene name when reading in the data.... I'd like 
>> to do pathway analysis but require either ENSEMBL or GO id format
>>
>> Any help would be greatly appreciated,
>>
>> .kripa
>>
>>
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives: 
>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>
>


