[BioC] Accession ID to Chromosome Name and Start-End

James W. MacDonald jmacdon at med.umich.edu
Fri Oct 3 14:41:13 CEST 2008


Hi Gundala,

Gundala Viswanath wrote:
> Dear experts,
> 
> Given accession IDs such as the ones listed below, how can I extract
> the "chromosome name", "start" and "end" position of each ID with
> Bioconductor?
> 
> 
> AB002292
> AB002296
> AB002298
> AB002303
> ..
> EF565109
> K03493
> L36149
> M16404
> X80391
> Z25470
> 
> I tried this, but it gives me many sets of coordinates instead of
> just the 3 corresponding to my query.
> 
>> library(biomaRt)
>> acc <- c("AB002292", "X80391", "Z25470")
>> mart <- useMart("ensembl")
>> mart <-useDataset("hsapiens_gene_ensembl",mart)
>> t <- getBM(attributes=c("chromosome_name", "start_position", "end_position"), values=acc, mart=mart)

You need the filters argument as well, so that getBM() knows what kind 
of ID your values are. In addition, I usually like to put the filter ID 
into the attributes too, so you can line things up if certain IDs 
don't return a result.

 > getBM(attributes = c("embl", "chromosome_name", "start_position", "end_position"),
 +       filters = "embl", values = acc, mart = mart)
       embl chromosome_name start_position end_position
1 AB002292               8        1759549      1894206
2   X80391              17        3141679      3142644
3   Z25470              18       13815543     13816861

In this case including the embl ID isn't necessary, since all three 
IDs return a result, but it might be for your whole vector of IDs.
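
As a rough sketch of what I mean by lining things up, something like 
the following should work. It uses only base R and the result shown 
above, typed in by hand so that it runs without a BioMart connection. 
The ID "XX000000" is made up, standing in for an accession that returns 
nothing, and the object names (res, aligned, missing) are just for 
illustration.

```r
# Result of the getBM() call above, entered by hand so this runs offline.
res <- data.frame(
  embl            = c("AB002292", "X80391", "Z25470"),
  chromosome_name = c("8", "17", "18"),
  start_position  = c(1759549, 3141679, 13815543),
  end_position    = c(1894206, 3142644, 13816861),
  stringsAsFactors = FALSE
)

# Query vector. "XX000000" is a hypothetical ID with no hit in 'res'.
acc <- c("AB002292", "XX000000", "X80391", "Z25470")

# IDs that were queried but did not come back.
missing <- setdiff(acc, res$embl)

# Left join on the queried IDs: every ID keeps at least one row,
# and IDs without a hit get NA coordinates.
aligned <- merge(data.frame(embl = acc, stringsAsFactors = FALSE),
                 res, by = "embl", all.x = TRUE)

# merge() sorts by the key, so put the rows back in query order.
aligned <- aligned[order(match(aligned$embl, acc)), ]

print(missing)
print(aligned)
```

The merge() rather than a plain match() is deliberate: an ID can come 
back with more than one row, and merge() keeps all of them.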

Best,

Jim


>> t
> 
> Please kindly advise.
> 
> 
> - Gundala Viswanath
> Jakarta - Indonesia

-- 
James W. MacDonald, M.S.
Biostatistician
Hildebrandt Lab
8220D MSRB III
1150 W. Medical Center Drive
Ann Arbor MI 48109-0646
734-936-8662


