[BioC] How to plot gene on their chromosome?

Hervé Pagès hpages at fhcrc.org
Thu May 21 19:46:44 CEST 2009


Hi Simon,

It's not a good idea to start a new thread by replying to a message from
a thread you started previously. The new post then shows up under the
previous thread even if you changed the subject.

more below...

Simon Noël wrote:
> Hello everyone.  I have a question.  I have a gene list in a .xls like
> 
> probeID	Symbol
> 1030431	ACSL1
> 4610431	ACTG2
> 4810575	ADAMTSL2
> 1510750	ADH1C
> 4060519	ADORA1
> 5720523	ADRA2A
> 2810482	AHNAK
> 1260270	AIM2
> 4180768	ALAS2
> ...     ...
> 
> I want to plot all of those genes on their chromosome.  How can I do this?

So first you need to map each gene to its chromosome location.

You can use one of the org.*.eg.db annotation packages for
this (pick the one for your organism):

   http://bioconductor.org/packages/release/data/annotation/

and use the SYMBOL2EG map to map your gene symbols to their corresponding
Entrez IDs and then the CHRLOC map to map your Entrez IDs to their chromosome
locations.

Example:

   library(org.Hs.eg.db)
   mysymbols <- c("ACSL1", "ACTG2", "ADAMTSL2", "ADH1C",
                  "ADORA1", "ADRA2A", "AHNAK", "AIM2", "ALAS2")
   myEgIDs <- unlist(mget(mysymbols, org.Hs.egSYMBOL2EG))
   mylocs <- unname(unlist(mget(myEgIDs, org.Hs.egCHRLOC)))

One thing to be aware of is that these mappings are not necessarily
one-to-one, e.g. the same symbol can be associated with different genes:

   > flat <- toTable(org.Hs.egSYMBOL2EG)
   > names(flat)
   [1] "gene_id" "symbol"
   > any(duplicated(flat$gene_id))
   [1] FALSE
   > any(duplicated(flat$symbol))
   [1] TRUE

The same thing happens with the org.Hs.egCHRLOC map, where a single
Entrez ID can be mapped to more than one location (I'm not sure why
we have this though, maybe others on the list can explain).

Anyway this explains why 'mylocs' can have more elements than 'mysymbols'.
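
Because unlist() and unname() throw away the link between a symbol and
its locations, it can help to work with the flattened tables instead and
join them on the Entrez ID. Below is a minimal sketch of that idea. It
uses two small made-up data frames (hypothetical IDs, symbols and
positions) standing in for toTable(org.Hs.egSYMBOL2EG) and
toTable(org.Hs.egCHRLOC), and it assumes the CHRLOC table has the columns
gene_id, start_location and Chromosome, which you should check with
names() on your version of the package. In the CHRLOC maps a negative
position means the gene is on the antisense strand.

```r
# Toy stand-ins for toTable(org.Hs.egSYMBOL2EG) and toTable(org.Hs.egCHRLOC).
# All IDs, symbols and positions here are made up for illustration only.
sym2eg <- data.frame(gene_id = c("101", "102", "103"),
                     symbol  = c("GENEA", "GENEB", "GENEB"),
                     stringsAsFactors = FALSE)
chrloc <- data.frame(gene_id        = c("101", "101", "102", "103"),
                     start_location = c(1500L, 1800L, -42000L, 900L),
                     Chromosome     = c("1", "1", "2", "X"),
                     stringsAsFactors = FALSE)

mysymbols <- c("GENEA", "GENEB")

# Keep only the symbols of interest, then join on the Entrez ID so that
# every location stays attached to its symbol.
hits <- merge(sym2eg[sym2eg$symbol %in% mysymbols, ], chrloc, by = "gene_id")

# Split the signed position into a strand and an absolute start.
hits$strand <- ifelse(hits$start_location < 0, "-", "+")
hits$start  <- abs(hits$start_location)

# How many locations ended up attached to each symbol?
nloc <- table(hits$symbol)
```

With the real tables, 'hits' has one row per (symbol, gene, location)
combination, so symbols that map to several genes or several locations
are easy to spot in 'nloc' before you plot anything.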

Cheers,
H.

> 
> Simon Noël
> VP Externe CADEUL
> Association des étudiants et étudiantes en Biochimie, Bio-
> informatique et Microbiologie de l'Université Laval
> CdeC

-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M2-B876
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages at fhcrc.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319


