[BioC] How to get gene information

Saroj K Mohapatra saroj at vt.edu
Fri May 29 04:10:03 CEST 2009


You can do some of the work within Bioconductor with the org.* annotation 
packages. Suppose you have a character vector of 3 human gene symbols.

 > glist
[1] "A1BG" "A2M"  "A2MP"

Using the corresponding "org." package:
 > library("org.Hs.eg.db")

you can map the gene symbols to Entrez gene ids:
 > mget(glist, revmap(org.Hs.egSYMBOL))
$A1BG
[1] "1"

$A2M
[1] "2"

$A2MP
[1] "3"
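
Gene symbols are occasionally renamed between annotation releases, and 
mget() stops with an error if any key is missing. A minimal sketch of a 
more forgiving lookup follows, assuming org.Hs.eg.db is installed; the 
symbol "NOT_A_GENE" is made up purely to show the unmatched case.

```r
library("org.Hs.eg.db")

# Example input; "NOT_A_GENE" is a made-up symbol, not a real gene.
glist <- c("A1BG", "A2M", "NOT_A_GENE")

# ifnotfound = NA returns NA for unknown symbols instead of raising an error.
ids <- mget(glist, revmap(org.Hs.egSYMBOL), ifnotfound = NA)

# Symbols that could not be mapped to any Entrez gene id.
unmatched <- names(ids)[sapply(ids, function(x) all(is.na(x)))]
```

The result is still a list, because a symbol can in principle map to 
more than one Entrez gene id.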

There are many other mappings available. Look at:
 > ls("package:org.Hs.eg.db")

For another organism, use the appropriate org.* package, e.g., 
org.Mm.eg.db for mouse. The second term (Mm) is a short form combining 
the first letter of the genus name and the first letter of the species 
name (Mus musculus).
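
The naming rule can be sketched in a few lines of base R. The helper 
org_pkg_name() below is hypothetical (it is not part of any package) and 
assumes the Entrez Gene ("eg") pattern; some organisms use a different 
middle term, so check the result against the annotation package list.

```r
# Hypothetical helper: build an org.* package name from a binomial species name.
# Assumes the "eg" (Entrez Gene) naming pattern, which not every organism follows.
org_pkg_name <- function(species) {
  parts <- strsplit(trimws(species), "\\s+")[[1]]
  stopifnot(length(parts) == 2)
  # First letter of the genus (upper case) plus first letter of the species (lower case).
  code <- paste0(toupper(substr(parts[1], 1, 1)),
                 tolower(substr(parts[2], 1, 1)))
  paste0("org.", code, ".eg.db")
}

org_pkg_name("Homo sapiens")   # "org.Hs.eg.db"
org_pkg_name("Mus musculus")   # "org.Mm.eg.db"
```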

The full list of annotation packages is available at 
http://www.bioconductor.org/packages/release/data/annotation/

Saroj

Kay Jaja wrote:
> I have a list of 80 genes in a txt file and I would like to use a database, for example NCBI, to get information on each of these genes. I need to get the start and end base pair positions for each gene listed in my file. Any idea how to get started or what to use?
>  
> Your help is greatly appreciated
