[BioC] Map the enriched chromosome bands to entrez genes

Wolfgang Huber whuber at embl.de
Wed Sep 8 16:34:40 CEST 2010


Dear Xi

yes, the code should be:

library("org.Hs.eg.db")
uu = as.list( revmap(org.Hs.egMAP) )
i = grep("^1q", names(uu))
uu[i]

I get a list with 106 elements, and 1796 unique Entrez-IDs:

 > head(uu[i])
$`1q`
[1] "4030"      "7254"      "100113374" "100302291" "100313962"

$`1q12`
  [1] "2369"      "9557"      "9659"      "9939"      "11243"     "27444"
  [7] "29765"     "114814"    "171419"    "401131"    "644450"    "100270895"

$`1q12-q21`
[1] "10262" "10903"

$`1q12-q21.2`
[1] "25832"

$`1q12-q22`
[1] "6063"

$`1q12-q23`
[1] "1805" "4002" "4209"

 > length(unique(unlist(uu[i])))
[1] 1796
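
To collect the genes for any arm in one step, the prefix match can be wrapped in a small function. This is only a sketch: `genes_on_arm` and `band_map` are names made up for illustration, and the toy list copies a few elements quoted in this thread in place of the real as.list(revmap(org.Hs.egMAP)).

```r
# Toy stand-in for as.list(revmap(org.Hs.egMAP)); values copied from this thread.
band_map <- list(
  "1q"         = c("4030", "7254", "100113374", "100302291", "100313962"),
  "1q12-q21"   = c("10262", "10903"),
  "1q12-q21.2" = "25832",
  "1q12-q23"   = c("1805", "4002", "4209"),
  "16q"        = c("8136", "140454", "171013", "100125393", "100303743")
)

# Hypothetical helper: unique Entrez IDs of all elements whose band label
# starts with `arm` (e.g. "1q" picks up "1q", "1q12-q21", ...).
genes_on_arm <- function(band_map, arm) {
  hit <- startsWith(names(band_map), arm)
  unique(unlist(band_map[hit], use.names = FALSE))
}

ids_1q <- genes_on_arm(band_map, "1q")
length(ids_1q)
```

The arm should be given with its p or q suffix; a bare "1" would also match labels for chromosomes 10 to 19.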


sessionInfo() as below - i.e. the same version of org.Hs.eg.db, 2.4.1, 
as you use. I have no idea why you only get 5 Entrez-IDs. What is the 
value of 'i' in your session after running the code above? Can you try 
again, from a clean R session, just to make sure there are no typos / 
remnants of previous expressions?
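
Checks along the following lines might help narrow it down. This is a sketch on a toy list (elements copied from this thread) rather than the real map, and the list of function names tested for masking is just a guess at likely culprits.

```r
# Toy stand-in for as.list(revmap(org.Hs.egMAP)); in a real session use that instead.
uu <- list(
  "16q"      = c("8136", "140454", "171013", "100125393", "100303743"),
  "1q"       = c("4030", "7254", "100113374", "100302291", "100313962"),
  "1q12-q21" = c("10262", "10903"),
  "1q12-q23" = c("1805", "4002", "4209")
)

i <- grep("^1q", names(uu))

length(i)        # a single match would explain seeing only the `1q` element
names(uu)[i]     # the band labels that were matched
lengths(uu[i])   # number of Entrez IDs per matched label

# Objects in the workspace that would mask the functions used above
# (empty in a clean session).
masked <- intersect(c("grep", "revmap", "unlist", "unique"), ls(globalenv()))
masked
```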

And, yes, I do think that a label such as `1q12` means that the gene is on 
the q arm of chromosome 1, and that such a gene is not additionally listed 
under the `1q` element, which presumably holds only genes without a finer 
annotation. The creator of the "org.Hs.eg.db" package might have more 
insight here.
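
One way to check this on the data is to test whether any ID filed under `1q` also appears under a finer label. The sketch below does so only for the six elements printed by head(uu[i]), typed in by hand as a toy list, so it does not prove anything about the full map.

```r
# The six elements shown by head(uu[i]), copied by hand.
uu <- list(
  "1q"         = c("4030", "7254", "100113374", "100302291", "100313962"),
  "1q12"       = c("2369", "9557", "9659", "9939", "11243", "27444",
                   "29765", "114814", "171419", "401131", "644450", "100270895"),
  "1q12-q21"   = c("10262", "10903"),
  "1q12-q21.2" = "25832",
  "1q12-q22"   = "6063",
  "1q12-q23"   = c("1805", "4002", "4209")
)

arm_only <- uu[["1q"]]                                        # arm-level label
finer    <- unlist(uu[names(uu) != "1q"], use.names = FALSE)  # all finer labels

# IDs annotated both at arm level and at a finer band
overlap <- intersect(arm_only, finer)
length(overlap)
```

On the real list the same comparison can be run with uu[i] in place of the toy data.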


	Best wishes
	Wolfgang


Xi Zhao wrote on 08/09/10 11:46:
>
> Dear Huber,
>
> Thanks for replying. I still have a problem retrieving the Entrez genes
> on a chromosome sub-arm, such as 1q or 16q.
>
> By running the sample code you gave, I retrieved 5 genes located on "1q",
> but isn't "1q" supposed to refer to the whole q arm of chromosome 1, which
> should harbor far more than 5 genes?
> uu = as.list( revmap(org.Hs.egMAP) )
> i = grep("^1q", names(uu))
> uu[i]
> $`1q`
> [1] "4030" "7254" "100113374" "100302291" "100313962"
>
> Looking at 16q in the GOstats results, there are 357 genes from 16q
> (among those on my array), but revmap(org.Hs.egMAP) gives only 5 genes
> for 16q. Does the notation "16q" in package "org.Hs.eg.db" not mean the
> whole q arm of chr 16, but only a cytoband?
>
> id       Pvalue        OddsRatio  ExpCount   Count  Size
> Chr 16q  1.210938e-88  20.332155  10.038699  113    357
>
> > get("16q", revmap(org.Hs.egMAP))
> [1] "8136" "140454" "171013" "100125393" "100303743"
>
> And I guess by library("org.Hs.egCHR") you meant library("org.Hs.eg.db")?
>
> Thanks again!
> Xi
>
>
>
>
> R version 2.11.1 (2010-05-31)
> x86_64-apple-darwin9.8.0
>
> locale:
> [1] C
>
> attached base packages:
> [1] tcltk grid stats graphics grDevices utils datasets methods
> [9] base
>
> other attached packages:
> [1] humanCHRLOC_2.1.6 GO.db_2.4.1 org.Hs.eg.db_2.4.1
> [4] qvalue_1.22.0 GOstats_2.14.0 RSQLite_0.9-1
> [7] DBI_0.2-5 graph_1.26.0 Category_2.14.0
> [10] AnnotationDbi_1.10.1 Biobase_2.8.0 ggplot2_0.8.8
> [13] proto_0.3-8 reshape_0.8.3 plyr_1.0.3
>
> loaded via a namespace (and not attached):
> [1] GSEABase_1.10.0 RBGL_1.24.0 XML_3.1-0 annotate_1.26.0
> [5] genefilter_1.30.0 splines_2.11.1 survival_2.35-8 tools_2.11.1
> [9] xtable_1.5-6
>
>
>
> On Sep 8, 2010, at 10:41 AM, Wolfgang Huber wrote:
>
>> Dear Xi,
>>
>> try this:
>>
>> library("org.Hs.egCHR")
>> uu = as.list( revmap(org.Hs.egMAP) )
>> print(uu)
>> i = grep("^1q", names(uu))
>> uu[i]
>>
>> length(unique(unlist(uu[i])))
>> # [1] 1796
>>
>>
>> > sessionInfo()
>> R version 2.12.0 Under development (unstable) (2010-09-07 r52876)
>> Platform: x86_64-unknown-linux-gnu (64-bit)
>>
>> locale:
>> [1] LC_CTYPE=la_AU.utf8 LC_NUMERIC=C
>> [3] LC_TIME=la_AU.utf8 LC_COLLATE=la_AU.utf8
>> [5] LC_MONETARY=C LC_MESSAGES=la_AU.utf8
>> [7] LC_PAPER=la_AU.utf8 LC_NAME=C
>> [9] LC_ADDRESS=C LC_TELEPHONE=C
>> [11] LC_MEASUREMENT=la_AU.utf8 LC_IDENTIFICATION=C
>>
>> attached base packages:
>> [1] stats graphics grDevices utils datasets methods base
>>
>> other attached packages:
>> [1] org.Hs.eg.db_2.4.1 RSQLite_0.9-2 DBI_0.2-5
>> [4] AnnotationDbi_1.11.4 Biobase_2.9.0 fortunes_1.3-7
>>
>> loaded via a namespace (and not attached):
>> [1] tools_2.12.0
>>
>>
>> Xi Zhao wrote on 08/09/10 10:10:
>>>
>>> Dear list,
>>>
>>> I'm struggling to retrieve the full list of Entrez gene IDs for each
>>> of the enriched chromosome bands (obtained by "GOstats").
>>>
>>> revmap(org.Hs.egCHR) doesn't give the Entrez IDs for the sub-arms, only
>>> for the whole arm:
>>>
>>> # Map between Entrez Gene IDs and chromosomes
>>> mget(c("16", "1q"), revmap(org.Hs.egCHR), ifnotfound=NA)
>>> $`1q`
>>> [1] NA
>>>
>>> revmap(org.Hs.egMAP) only gives a few genes located on that
>>> chromosome... (or did I do it wrong?)
>>>
>>> # Map between Entrez Gene identifiers and cytogenetic maps/bands
>>> mget(c("16", "1q"), revmap(org.Hs.egMAP), ifnotfound=NA)
>>> $`16`
>>> [1] "8720"
>>> $`1q`
>>> [1] "4030" "7254" "100113374" "100302291" "100313962"
>>>
>>> Any suggestion / hint is appreciated!
>>>
>>> Kindest regards,
>>> Xi
>>>
>>>
>>>
>>>
>>>
>>> _______________________________________________
>>> Bioconductor mailing list
>>> Bioconductor at stat.math.ethz.ch
>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>> Search the archives:
>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>
>>
>> --
>>
>>
>> Wolfgang Huber
>> EMBL
>> http://www.embl.de/research/units/genome_biology/huber
>>
>


-- 


Wolfgang Huber
EMBL
http://www.embl.de/research/units/genome_biology/huber
