[BioC] Retrieving SNP rs IDs using biomaRt getBM()

Hervé Pagès hpages at fhcrc.org
Wed Nov 21 22:04:30 CET 2012


Hi Sonia,

If you have human SNPs, an alternative is to use a SNPlocs package:

   library(SNPlocs.Hsapiens.dbSNP.20120608)
   ch19_snps <- getSNPlocs("ch19", as.GRanges=TRUE)
   mypos <- c(45412079, 45415640)
   idx <- match(mypos, start(ch19_snps))
   rsids <- mcols(ch19_snps)$RefSNP_id[idx]

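This scales well even if you have a lot of positions (e.g. hundreds of
thousands), but you need to work one chromosome at a time. Note that
match() returns NA for any position where no SNP starts, and only the
first hit if several SNPs start at the same position.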
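
For positions spread over several chromosomes, the per-chromosome lookup can be wrapped in a loop. The following is only a sketch under assumptions: `toy_snps` is a made-up stand-in for what getSNPlocs() returns (its "ch1" entries are invented purely for illustration), and the `query` data frame layout is hypothetical.

```r
# Toy per-chromosome SNP tables standing in for
# getSNPlocs(chr, as.GRanges=TRUE). The "ch1" rows are invented.
toy_snps <- list(
  ch19 = data.frame(pos = c(45412079, 45415640),
                    id  = c("7412", "445925"),
                    stringsAsFactors = FALSE),
  ch1  = data.frame(pos = c(1000, 2000),
                    id  = c("111", "222"),
                    stringsAsFactors = FALSE)
)

# Query positions on more than one chromosome (hypothetical layout).
query <- data.frame(chr = c("ch19", "ch1", "ch19"),
                    pos = c(45415640, 2000, 45412079),
                    stringsAsFactors = FALSE)

# Process one chromosome at a time and write results back in place,
# so the output keeps the order of the input rows.
query$rsid <- NA_character_
for (chr in unique(query$chr)) {
  snps <- toy_snps[[chr]]   # with SNPlocs: getSNPlocs(chr, as.GRanges=TRUE)
  rows <- which(query$chr == chr)
  idx  <- match(query$pos[rows], snps$pos)
  query$rsid[rows] <- snps$id[idx]   # NA where no SNP starts at that position
}
```

With a real SNPlocs package, each chromosome's SNPs are loaded only once, however many positions are queried on it.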

Note that the rs IDs are stored without the "rs" prefix in the GRanges
object returned by getSNPlocs():

   > rsids
   [1] "7412"   "445925"
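
If you need the IDs in the usual "rs..." form, the prefix can be pasted back on. Here is a minimal base-R sketch, assuming plain vectors `snp_pos` and `snp_id` as stand-ins for start(ch19_snps) and mcols(ch19_snps)$RefSNP_id; the third query position is hypothetical and is simply absent from the toy table.

```r
# Stand-ins for start(ch19_snps) and mcols(ch19_snps)$RefSNP_id,
# holding only the two SNPs shown in this thread.
snp_pos <- c(45412079, 45415640)
snp_id  <- c("7412", "445925")

# Two positions from the thread plus one hypothetical position
# that is not in the toy table.
mypos <- c(45412079, 45415640, 45415000)

idx <- match(mypos, snp_pos)   # NA where no SNP starts at the position

# Add the "rs" prefix, keeping NA (rather than "rsNA") for misses.
rsids <- ifelse(is.na(idx), NA_character_, paste0("rs", snp_id[idx]))
```

Going through ifelse() avoids paste0() turning a missing ID into the string "rsNA".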

Cheers,
H.


On 11/21/2012 10:14 AM, Sonia Shah [guest] wrote:
>
> I have a list of chromosomal positions for which I would like to retrieve SNP rs IDs (if present at these locations). I used the following code to try and get the rs IDs at 2 locations.
>
> getBM(
> attributes=c("refsnp_id","chr_name","chrom_start"),
> filters=c("chr_name","chrom_start","chrom_end"), values=list(c(19,19), c(45412079,45415640), c(45412079,45415640)), mart)
>
> I get back the rs IDs for these 2 locations but also get a list of snps that lie within these 2 positions (a total of 82 SNPs are returned with this query).
>
> How do I query the database to return only the rs ids at the 2 specified chromosomal positions?
>
> Many thanks
> Sonia
>
>   -- output of sessionInfo():
>
> R version 2.11.1 (2010-05-31)
> x86_64-redhat-linux-gnu
>
> locale:
>   [1] LC_CTYPE=en_US.iso885915       LC_NUMERIC=C
>   [3] LC_TIME=en_US.iso885915        LC_COLLATE=en_US.iso885915
>   [5] LC_MONETARY=C                  LC_MESSAGES=en_US.iso885915
>   [7] LC_PAPER=en_US.iso885915       LC_NAME=C
>   [9] LC_ADDRESS=C                   LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.iso885915 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> other attached packages:
> [1] biomaRt_2.4.0
>
> loaded via a namespace (and not attached):
> [1] RCurl_1.91-1 tools_2.11.1 XML_3.9-4
>
> --
> Sent via the guest posting facility at bioconductor.org.
>

-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages at fhcrc.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319


