[BioC] Genbank accession annotation?
hrh at fmi.ch
Fri Oct 4 09:26:51 CEST 2013
For this particular EST sequence, you can find the annotation in UniGene.
If you have many such EST sequences, I recommend downloading the UniGene
data and doing some horrible parsing (with your favorite parsing language).....
For "R28020", you will get:
TITLE RAB2A, member RAS oncogene family
SEQUENCE ACC=R28020.1; NID=g784155; CLONE=IMAGE:133972; END=3';
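As a rough illustration of that parsing, here is a minimal Python sketch. It relies only on the two line types shown above (TITLE and SEQUENCE with an ACC= field). The "//" record separator and the function name parse_unigene are my assumptions for illustration, so check them against the file you actually download.

```python
import re


def parse_unigene(lines):
    """Map each accession (version suffix stripped) to its cluster TITLE.

    Assumes each record has a TITLE line and one or more SEQUENCE lines
    containing "ACC=<accession>.<version>;", and that records end with a
    line starting with "//" (an assumption, not confirmed by this thread).
    """
    acc_to_title = {}
    title = None
    accessions = []

    def flush():
        # Attach the current record's title to every accession seen in it.
        for acc in accessions:
            acc_to_title[acc] = title

    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("//"):
            flush()
            title, accessions = None, []
        elif line.startswith("TITLE"):
            title = line[len("TITLE"):].strip()
        elif line.startswith("SEQUENCE"):
            match = re.search(r"ACC=([^;\s]+)", line)
            if match:
                # "R28020.1" -> "R28020"
                accessions.append(match.group(1).split(".")[0])

    flush()  # handle a final record with no trailing separator
    return acc_to_title


# Sample record built from the two lines quoted in this message.
sample = [
    "TITLE RAB2A, member RAS oncogene family",
    "SEQUENCE ACC=R28020.1; NID=g784155; CLONE=IMAGE:133972; END=3';",
    "//",
]

mapping = parse_unigene(sample)
print(mapping.get("R28020"))
```

For a real download, you would pass an open file handle instead of the sample list and then look up each of your accessions in the resulting dictionary.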
On 10/03/2013 10:34 PM, James W. MacDonald wrote:
> Hi Ed,
> Hypothetically you would want to use the org.Hs.eg.db package. However,
> not all GenBank accession numbers will be annotated, presumably because
> they have been retired. Alternatively you could use biomaRt as well.
> However, the example ID you give is not annotated by either source.
> On Wednesday, October 02, 2013 4:05:51 PM, Ed Siefker wrote:
>> What package would I need to transform Genbank accession numbers into
>> gene symbols or Entrez Gene IDs? E.g., if I search "R28020" on NCBI, it
>> tells me that "This EST is one of 1366 sequences matched to RAB2A: RAB2A,
>> member RAS oncogene family. "
>> Is there a metadata package that has this kind of information in it? I
>> have a couple hundred such identifiers that I need to map to genes. I'd
>> like to be able to run
>> getSYMBOL("R28020", "some_annotation_package")
>> and get a useful result. Any ideas?
>> Bioconductor mailing list
>> Bioconductor at r-project.org
>> Search the archives:
> James W. MacDonald, M.S.
> University of Washington
> Environmental and Occupational Health Sciences
> 4225 Roosevelt Way NE, # 100
> Seattle WA 98105-6099