[BioC] o

John Zhang <jzhang@jimmy.harvard.edu>
Thu, 19 Dec 2002 10:54:39 -0500 (EST)


You probably need to parse two files to get the mappings you want. One file, 
from LocusLink (ftp://ftp.ncbi.nih.gov/refseq/locuslink/LL_tmpl.gz), contains 
mappings from GenBank accession numbers to GO ids. The other, from GO 
(ftp://ftp.geneontology.org/pub/go/xml/go_200211_termdb.xml.gz), contains the 
GO terms. 
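
For illustration only, here is a minimal Python sketch of pulling 
accession-to-GO mappings out of LL_tmpl-style text. It is not AnnBuilder code. 
The record layout it assumes (records opened by a line starting with ">>", 
"ACCNUM:" lines whose first "|"-separated field is the accession, and "GO:" 
lines containing an id of the form GO:nnnnnnn) should be checked against the 
real file. The function name parse_ll_records and the sample accessions are 
made up.

```python
import re

# GO ids look like "GO:" followed by seven digits.
GO_ID = re.compile(r"GO:\d{7}")


def parse_ll_records(lines):
    """Return {accession: set of GO ids} from LL_tmpl-style lines.

    Assumed layout (verify against the real file before relying on it):
      - a line starting with '>>' opens a new record
      - 'ACCNUM:' lines hold '|'-separated fields, accession first
      - 'GO:' lines contain a GO id somewhere after the prefix
    """
    acc_to_go = {}
    accessions, go_ids = [], set()
    # The trailing '>>' is a sentinel so the last record is flushed too.
    for raw in list(lines) + [">>"]:
        line = raw.strip()
        if line.startswith(">>"):
            if go_ids:
                for acc in accessions:
                    acc_to_go.setdefault(acc, set()).update(go_ids)
            accessions, go_ids = [], set()
        elif line.startswith("ACCNUM:"):
            acc = line[len("ACCNUM:"):].strip().split("|")[0]
            if acc:
                accessions.append(acc)
        elif line.startswith("GO:"):
            go_ids.update(GO_ID.findall(line[len("GO:"):]))
    return acc_to_go


# Made-up sample in the assumed layout.
sample = [
    ">>1",
    "LOCUSID: 1",
    "ACCNUM: X00001|111|na|na|na",
    "ACCNUM: X00002|112|na|na|na",
    "GO: molecular function|protein binding|IEA|GO:0005515|Proteome|na",
    ">>2",
    "ACCNUM: X00003|113|na|na|na",
]
result = parse_ll_records(sample)
```

To run this on the downloaded file, the lines could come from 
gzip.open("LL_tmpl.gz", "rt") instead of the sample list.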

I would suggest that you create a ";"-separated text file with one column for 
GenBank names and another for GenBank accession numbers. Then write a short 
Perl script and parse LL_tmpl.gz using AnnBuilder (following the examples in 
the vignette "Basic functions of AnnBuilder"). That will give you the mappings 
from GenBank names to GO ids. AnnBuilder already has a parser (GOXMLParser) to 
parse the GO file (code examples are in the vignette "Source specific functions 
of AnnBuilder"). All you need to do then is map the GO ids to GO terms. 
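
The final joining step can be sketched roughly as follows, in Python rather 
than with AnnBuilder. This assumes you already have the two lookups in memory: 
accession to GO ids, and GO id to GO term. The function name annotate, the 
"NA" placeholder, and the sample data are all hypothetical.

```python
import csv


def annotate(name_acc_lines, acc_to_go, go_terms):
    """Join 'name;accession' lines with GO ids and GO terms.

    name_acc_lines: iterable of ';'-separated lines (name, accession)
    acc_to_go:      dict mapping accession -> set of GO ids
    go_terms:       dict mapping GO id -> GO term
    Returns a list of (name, GO id, GO term) tuples; genes without any
    GO id get a single row filled with "NA".
    """
    rows = []
    for record in csv.reader(name_acc_lines, delimiter=";"):
        if len(record) < 2:
            continue  # skip blank or malformed lines
        name, acc = record[0].strip(), record[1].strip()
        go_ids = sorted(acc_to_go.get(acc, ()))
        if not go_ids:
            rows.append((name, "NA", "NA"))
            continue
        for go_id in go_ids:
            rows.append((name, go_id, go_terms.get(go_id, "NA")))
    return rows


# Made-up inputs for demonstration.
rows = annotate(
    ["geneA;X00001", "geneB;X00009"],
    {"X00001": {"GO:0005515"}},
    {"GO:0005515": "protein binding"},
)
for row in rows:
    print("\t".join(row))
```

One row per (gene, GO id) pair keeps the output easy to filter. If you want one 
row per gene instead, the rows can be grouped by name afterwards.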


>To: bioconductor@stat.math.ethz.ch
>From: AlessandroSemeria@cramont.it
>Subject: [BioC] o
>Date: Thu, 19 Dec 2002 09:38:13 +0100
>
>Hello!
>I don't know how to associate ontology terms with a long list of GenBank
>names (a tab-delimited text file or an XML file produced by the Agilent cDNA
>scanner and software). That is, I would like as output a formatted file with
>4 columns (1: GenBank name; 2, 3, 4: ontology).
>I know that I have to perform a mapping of genes. I had a look at the
>AnnBuilder and annotate packages, but I have no idea where to start.
>Any suggestions? Thanks in advance!
>
>A. S.
>
>----------------------------
>
>|------------------------------------+------------------------------------|
>|Alessandro Semeria                  |Tel. +39 544 536811                 |
>|------------------------------------+------------------------------------|
>|Models and Simulation Laboratory    |Fax. +39 544 538663                 |
>|------------------------------------+------------------------------------|
>|The Environment Research Center -   |                                    |
>|Montecatini (Edison Group),    Via  |                                    |
>|Ciro Menotti 48,                    |E-mail: asemeria@cramont.it         |
>|48023 Marina di Ravenna (RA), Italy |                                    |
>|------------------------------------+------------------------------------|