[BioC] Question about mget vs. select for annotation package

Hervé Pagès hpages at fhcrc.org
Wed Jul 3 00:26:04 CEST 2013


Hi Christina,

In AnnotationDbi jargon, a probe that matches multiple genes is called
a multiple probe. When using the classic Bimap API, multiple probes are
mapped to NA by default, unless you use toggleProbes() on the Bimap
object to request the full mapping:

   > map <- toggleProbes(hgu133plus2ENTREZID, "all")

   > mget("213801_x_at", map)
   $`213801_x_at`
   [1] "3921"   "388524" "574040" "6044"   "653162" "730029"
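
If you want the mget() result in the same long format that select()
returns, the list can be flattened with base R. This is only a minimal
sketch: the mapping is hard-coded from the output above so that it runs
without the annotation package, and the names res, long, n_genes and
multi are made up for illustration.

```r
# Mapping hard-coded from the mget() output above (an assumption made so
# that this sketch runs without hgu133plus2.db installed).
res <- list(`213801_x_at` = c("3921", "388524", "574040",
                              "6044", "653162", "730029"))

# Flatten the named list into one row per probe/gene pair, which is the
# shape of the select() result.
long <- data.frame(
  PROBEID  = rep(names(res), lengths(res)),
  ENTREZID = unlist(res, use.names = FALSE),
  stringsAsFactors = FALSE
)

# Count genes per probe; a probe with more than one gene is a
# multiple probe.
n_genes <- table(long$PROBEID)
multi <- names(n_genes)[n_genes > 1]

print(long)
print(multi)
```

With the real package, res would instead come from the mget() call on
the toggled map shown above.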

Personally I think that making multiple probes appear as if they were
not mapped to any gene is not doing any good. Hopefully this default
can be reconsidered at some point.

Cheers,
H.


On 07/02/2013 02:53 PM, Christina Chaivorapol wrote:
> Hi,
>
> I seem to be getting different results depending on if I use select() or
> mget() with the hgu133plus2.db package for a probe with a 1 probe to many
> gene mapping. Does anyone know why there is a discrepancy?
>
>> select(hgu133plus2.db, keys="213801_x_at", cols=c("ENTREZID", "SYMBOL"),
> keytype="PROBEID")
>        PROBEID ENTREZID  SYMBOL
> 1 213801_x_at     3921    RPSA
> 2 213801_x_at   388524 RPSAP58
> 3 213801_x_at   574040  SNORA6
> 4 213801_x_at     6044 SNORA62
> 5 213801_x_at   653162  RPSAP9
> 6 213801_x_at   730029 RPSAP19
> Warning message:
> In .generateExtraRows(tab, keys, jointype) :
>    'select' resulted in 1:many mapping between keys and return rows
>
>> mget("213801_x_at", hgu133plus2ENTREZID)
> $`213801_x_at`
> [1] NA
>
>> sessionInfo()
> R version 3.0.0 (2013-04-03)
> Platform: x86_64-unknown-linux-gnu (64-bit)
>
> locale:
>   [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>   [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>   [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
>   [7] LC_PAPER=C                 LC_NAME=C
>   [9] LC_ADDRESS=C               LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] parallel  stats     graphics  grDevices utils     datasets  methods
> [8] base
>
> other attached packages:
> [1] hgu133plus2.db_2.9.0 org.Hs.eg.db_2.9.0   RSQLite_0.11.3
> [4] DBI_0.2-6            AnnotationDbi_1.22.3 Biobase_2.20.0
> [7] BiocGenerics_0.6.0   limma_3.16.2
>
> loaded via a namespace (and not attached):
> [1] IRanges_1.18.0 stats4_3.0.0   tools_3.0.0
>
> Thanks,
> Christina
>

-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages at fhcrc.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319


