[BioC] GenBank RefSeq conversion
sdavis2 at mail.nih.gov
Fri May 30 15:09:42 CEST 2008
On Fri, May 30, 2008 at 8:53 AM, Eleni Christodoulou
<elenichri at gmail.com> wrote:
> Hello all!
> I was trying to convert RefSeq accession numbers to GenBank accession numbers
> (or the opposite). I think that there must exist a library that does this
> job automatically... Does anyone know anything relevant to this?
Hi, Eleni. There is no direct relationship between RefSeq and GenBank
accession numbers. A given RefSeq may or may not be represented by
exactly one GenBank accession. In fact, a RefSeq may not represent any
"real" sequence, but can be a composite of several "real" sequences.
As an example, see here:
It looks like this RefSeq is actually composed of 4 different
sequences from GenBank (if I am reading the record correctly).
The only way I know to deal with this (at least in the general case)
is to go through Entrez Gene (or the Ensembl equivalent of a gene) to
find those accessions in GenBank and RefSeq that share a common Gene
ID. You can do this using the annotation package for the organism of
interest, I think. Steffen or others might be able to comment on how
to do this using biomaRt.
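
To make the "go through the Gene ID" idea concrete, here is a rough
sketch in Python. It is not tied to any Bioconductor package or to
biomaRt; the accessions, Gene IDs, and the function name
refseq_to_genbank are all made up for illustration. In practice the
two accession-to-gene tables would come from an annotation resource.
The point is only that the mapping is many-to-many: a RefSeq may map
to several GenBank accessions, or to none.

```python
from collections import defaultdict

# Hypothetical toy tables of (accession, Gene ID) pairs.
# These identifiers are invented; real ones would come from an
# annotation resource for the organism of interest.
REFSEQ_TO_GENE = [
    ("NM_EXAMPLE_1", "101"),
    ("NM_EXAMPLE_2", "101"),
    ("NM_EXAMPLE_3", "202"),
]
GENBANK_TO_GENE = [
    ("GB_EXAMPLE_A", "101"),
    ("GB_EXAMPLE_B", "101"),
    ("GB_EXAMPLE_C", "303"),
]


def refseq_to_genbank(refseq_pairs, genbank_pairs):
    """Map each RefSeq accession to the GenBank accessions that
    share a Gene ID with it (possibly none, possibly several)."""
    # Index GenBank accessions by Gene ID.
    genbank_by_gene = defaultdict(set)
    for accession, gene_id in genbank_pairs:
        genbank_by_gene[gene_id].add(accession)

    # Join: for each RefSeq, collect GenBank accessions of its gene(s).
    mapping = defaultdict(set)
    for accession, gene_id in refseq_pairs:
        mapping[accession] |= genbank_by_gene.get(gene_id, set())

    # Sorted lists give stable, readable output.
    return {acc: sorted(hits) for acc, hits in mapping.items()}


if __name__ == "__main__":
    for refseq, genbank in refseq_to_genbank(
        REFSEQ_TO_GENE, GENBANK_TO_GENE
    ).items():
        print(refseq, "->", genbank if genbank else "(no GenBank match)")
```

Note that the result links accessions that belong to the same gene,
not accessions that describe the same sequence, so it should be read
as "related through a gene" rather than as an exact conversion.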