[BioC] AnnBuilder and Kegg

John Zhang jzhang at jimmy.harvard.edu
Wed Nov 22 15:29:18 CET 2006


>After the program finishes I eventually have an annotation package for
>my data but it does not contain any kegg data. 

Look at your code. The organism name is misspelled (Mus Musclusus rather than
Mus musculus).
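
A misspelled name can be caught before starting a long build. The following is
only a minimal sketch in base R: the list of accepted names and the helper
check_organism() are hypothetical, since this thread does not show which names
AnnBuilder itself accepts.

```r
# Hypothetical list of accepted scientific names (an assumption for this
# sketch; the names AnnBuilder really accepts are not shown in this thread).
known_organisms <- c("Homo sapiens", "Mus musculus", "Rattus norvegicus")

# Return the canonical spelling of 'name', or stop with the closest candidates.
check_organism <- function(name, known = known_organisms) {
  hit <- match(tolower(name), tolower(known))
  if (!is.na(hit)) {
    return(known[hit])
  }
  # agrep() does approximate matching; max.distance is given here as a
  # fraction of the pattern length.
  near <- agrep(name, known, max.distance = 0.3,
                ignore.case = TRUE, value = TRUE)
  hint <- if (length(near) > 0) {
    paste0(" Did you mean: ", paste(near, collapse = ", "), "?")
  } else {
    ""
  }
  stop(paste0("Unknown organism '", name, "'.", hint), call. = FALSE)
}

# A correctly spelled name passes, whatever its capitalisation.
check_organism("mus musculus")
```

The checked value could then be passed as the organism argument, so that a typo
stops the script at once instead of producing an empty PATH mapping.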

>
>When I install my package (outside R, under linux)
>I have this:
>
>*******************************************************************************
>R CMD INSTALL lgtc201106
>* Installing *source* package 'lgtc201106' ...
>** R
>** data
>**  moving datasets to lazyload DB
>** help
> >>> Building/Updating help pages for package 'lgtc201106'
>     Formats: text html latex example
>  lgtc201106                        text    html    latex
>  lgtc201106ACCNUM                  text    html    latex   example
>  lgtc201106CHR                     text    html    latex   example
>  lgtc201106ENZYME                  text    html    latex   example
>  lgtc201106GENENAME                text    html    latex   example
>  lgtc201106GO                      text    html    latex   example
>  lgtc201106GO2ALLPROBES            text    html    latex   example
>  lgtc201106GO2PROBE                text    html    latex   example
>  lgtc201106LOCUSID                 text    html    latex   example
>  lgtc201106MAP                     text    html    latex   example
>  lgtc201106OMIM                    text    html    latex   example
>  lgtc201106ORGANISM                text    html    latex   example
>  lgtc201106PATH                    text    html    latex   example
>  lgtc201106PMID                    text    html    latex   example
>  lgtc201106PMID2PROBE              text    html    latex   example
>  lgtc201106QC                      text    html    latex
>  lgtc201106QCDATA                  text    html    latex
>  lgtc201106REFSEQ                  text    html    latex   example
>  lgtc201106SUMFUNC                 text    html    latex   example
>  lgtc201106SYMBOL                  text    html    latex   example
>  lgtc201106UNIGENE                 text    html    latex   example
>** building package indices ...
>* DONE (lgtc201106)
>*******************************************************************************
>
>and when I call the library in R
>*******************************************************************************
>library(lgtc201106)
>lgtc201106()
>
>
>Quality control information for  lgtc201106
>Date built: Created: Wed Nov 22 13:12:38 2006
>
>Number of probes: 23233
>Probe number missmatch: None
>Probe missmatch: None
>Mappings found for probe based rda files:
>         lgtc201106ACCNUM found 22512 of 23233
>         lgtc201106CHR found 18757 of 23233
>         lgtc201106ENZYME found 0 of 23233
>         lgtc201106GENENAME found 18674 of 23233
>         lgtc201106GO found 0 of 23233
>         lgtc201106GO found 0 of 23233
>         lgtc201106LOCUSID found 18977 of 23233
>         lgtc201106MAP found 15808 of 23233
>         lgtc201106OMIM found 433 of 23233
>         lgtc201106PATH found 0 of 23233
>         lgtc201106PMID found 18967 of 23233
>         lgtc201106REFSEQ found 14098 of 23233
>         lgtc201106SUMFUNC found 0 of 23233
>         lgtc201106SYMBOL found 18977 of 23233
>         lgtc201106UNIGENE found 18149 of 23233
>Mappings found for non-probe based rda files:
>         lgtc201106GO2ALLPROBES found 6994
>         lgtc201106GO2PROBE found 5360
>         lgtc201106ORGANISM found 1
>         lgtc201106PMID2PROBE found 92300
>
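
The QC report above can be scanned for mappings that found nothing. This is a
rough sketch in base R, assuming the report lines are available as character
strings (only three lines are copied here). The names qc_lines, qc and empty
are my own illustration and not part of the package.

```r
# A few lines copied from the QC report quoted above.
qc_lines <- c(
  "         lgtc201106ACCNUM found 22512 of 23233",
  "         lgtc201106ENZYME found 0 of 23233",
  "         lgtc201106PATH found 0 of 23233"
)

# Capture the mapping name, the number found and the total.
pattern <- "^[[:space:]]*([^[:space:]]+) found ([0-9]+) of ([0-9]+)[[:space:]]*$"
parts <- regmatches(qc_lines, regexec(pattern, qc_lines))

qc <- data.frame(
  mapping = vapply(parts, `[`, character(1), 2),
  found   = as.integer(vapply(parts, `[`, character(1), 3)),
  total   = as.integer(vapply(parts, `[`, character(1), 4)),
  stringsAsFactors = FALSE
)

# Mappings with no hits at all, which deserve a closer look.
empty <- qc$mapping[qc$found == 0]
print(empty)
```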
>kegg <- as.list(lgtc201106PATH2PROBE)
>Error: object "lgtc201106PATH2PROBE" not found
>Error in as.list(lgtc201106PATH2PROBE) : unable to find the argument 'x' in
>selecting a method for function 'as.list'
>
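
The error above only says that the object was never created. A guarded lookup
makes that explicit. This is a sketch in base R. get_mapping() is a
hypothetical helper, and the demo environment stands in for a real annotation
package.

```r
# Return the named mapping as a list, or NULL with a warning if no object
# of that name exists.
get_mapping <- function(name, envir = globalenv()) {
  if (!exists(name, envir = envir)) {
    warning("Mapping '", name, "' not found", call. = FALSE)
    return(NULL)
  }
  as.list(get(name, envir = envir))
}

# Stand-in for an annotation package: one environment-based mapping.
demo <- new.env()
assign("demoPATH", list2env(list(probe1 = "00010")), envir = demo)

present <- get_mapping("demoPATH", envir = demo)
missing_map <- suppressWarnings(get_mapping("demoPATH2PROBE", envir = demo))
```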
>
>*******************************************************************************
>
>thanks 
>
>P
>
>
>
>-----Original Message-----
>From: John Zhang [mailto:jzhang at jimmy.harvard.edu]
>Sent: Wed 11/22/2006 2:25 PM
>To: jzhang at jimmy.harvard.edu; Pedotti, P. (HKG)
>Cc: bioconductor at stat.math.ethz.ch
>Subject: Re: [BioC] AnnBuilder and Kegg
> 
>
>>thank you for the suggestions. 
>>However, I downloaded the newest version of AnnBuilder
>>and still I had the same problem in kegg connection.
>
>Have you looked at the built package to see whether it contains any pathway
>annotation? Warning messages like:
>
>Failed to get data from URL:
>ftp://ftp.genome.ad.jp/pub/kegg/pathways//00010.gene
>
>just tell you that there are name mismatches in KEGG's data files; the data
>package should still build.
>
>I will try to write more informative warning messages when I get the chance.
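
The failed URLs quoted above contain an empty path segment
("pathways//00010.gene"). One possible reading, and it is only an assumption,
is that the URL is assembled from a KEGG organism code that came back empty.
The helper kegg_gene_url() below is hypothetical and is not AnnBuilder's actual
code. "mmu" is the KEGG code for Mus musculus.

```r
# Hypothetical URL builder: base / code / code + pathway id + ".gene".
# The real AnnBuilder logic is not shown in this thread.
kegg_gene_url <- function(base, org_code, pathway_id) {
  paste0(base, "/", org_code, "/", org_code, pathway_id, ".gene")
}

base <- "ftp://ftp.genome.ad.jp/pub/kegg/pathways"

# With a valid organism code the path has an organism directory.
with_code <- kegg_gene_url(base, "mmu", "00010")

# With an empty code the result has the same shape as the failed URLs.
without_code <- kegg_gene_url(base, "", "00010")
print(c(with_code, without_code))
```

If that reading is right, an unrecognised organism name would give the empty
code, and every pathway download would fail in the same way.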
>
>>
>>******************************************************************************
>>
>>sessionInfo()
>>Version 2.3.1 (2006-06-01)
>>i386-pc-linux-gnu
>>
>>attached base packages:
>>[1] "tools"     "methods"   "stats"     "graphics"  "grDevices" "utils"
>>[7] "datasets"  "base"
>>
>>other attached packages:
>>        GO AnnBuilder    RSQLite        DBI   annotate        XML    Biobase
>>  "1.12.0"   "1.12.0"    "0.4-1"   "0.1-10"   "1.10.0"   "0.99-7"   "1.10.0"
>>
>>mySrcUrls <- c(
>>  GO = "http://www.godatabase.org/dev/database/archive/latest/go_200605-termdb.rdf-xml.gz",
>>  KEGG = "ftp://ftp.genome.ad.jp/pub/kegg/pathways",
>>  YG = "ftp://genome-ftp.stanford.edu/pub/yeast/data_download/",
>>  HG = "ftp://ftp.ncbi.nih.gov/pub/HomoloGene/old/hmlg.ftp",
>>  EG = "ftp://ftp.ncbi.nlm.nih.gov/gene/DATA",
>>  IPI = "ftp://ftp.ebi.ac.uk/pub/databases/IPI/current/",
>>  YEAST = "ftp://ftp.yeastgenome.org/pub/yeast/sequence_similarity/domains/",
>>  KEGGGENOME = "ftp://ftp.genome.ad.jp/pub/kegg/tarfiles/genome",
>>  PFAM = "ftp://ftp.sanger.ac.uk/pub/databases/Pfam/current_release/Pfam-A.full.gz"
>>)
>>ppbase<- file.path(.path.package("AnnBuilder"), "data",
>>"lgtc.ids.1.txt")
>>myBaseType="gb"
>>ABPkgBuilder(baseName=ppbase,
>>+                       srcUrls = mySrcUrls,
>>+                       baseMapType = myBaseType,
>>+                       pkgName = "lgtc.221106",
>>+                       pkgPath = '.',
>>+                       organism ="mouse",
>>+                       version ="1.1.0",
>>+                       author = list(author = "Paola Pedotti",
>>+                       maintener ="Paola Pedotti <p.pedotti at lumc.nl>")
>>+                        )
>>
>>Failed to get data from URL:
>>ftp://ftp.genome.ad.jp/pub/kegg/pathways//00010.gene
>>Failed to get data from URL:
>>ftp://ftp.genome.ad.jp/pub/kegg/pathways//00020.gene
>>Failed to get data from URL:
>>ftp://ftp.genome.ad.jp/pub/kegg/pathways//00030.gene
>>Failed to get data from URL:
>>ftp://ftp.genome.ad.jp/pub/kegg/pathways//00031.gene
>>Failed to get data from URL:
>>ftp://ftp.genome.ad.jp/pub/kegg/pathways//00040.gene
>>Failed to get data from URL:
>>ftp://ftp.genome.ad.jp/pub/kegg/pathways//00051.gene
>>Failed to get data from URL:
>>ftp://ftp.genome.ad.jp/pub/kegg/pathways//00052.gene
>>Failed to get data from URL:
>>ftp://ftp.genome.ad.jp/pub/kegg/pathways//00053.gene
>>Failed to get data from URL:
>>ftp://ftp.genome.ad.jp/pub/kegg/pathways//00061.gene
>>Failed to get data from URL:
>>ftp://ftp.genome.ad.jp/pub/kegg/pathways//00062.gene
>>Failed to get data from URL:
>>ftp://ftp.genome.ad.jp/pub/kegg/pathways//00071.gene
>>Failed to get data from URL:
>>ftp://ftp.genome.ad.jp/pub/kegg/pathways//00072.gene
>>Failed to get data from URL:
>>ftp://ftp.genome.ad.jp/pub/kegg/pathways//00100.gene
>>......................
>>
>>
>>******************************************************************************
>>
>>
>>Do you have other suggestions?
>>
>>thanks
>>
>>Paola
>>
>>
>>On Tue, 2006-11-21 at 12:13 -0500, John Zhang wrote:
>>> >
>>> >Hi everybody,
>>> >I am trying to annotate my dataset (home-spotted array, two colors,
>>> >mice) using AnnBuilder.
>>> >Every time I run the program, the connection to the KEGG
>>> >website fails, so I am able to build the annotation
>>> >package but without the KEGG pathways. Does anybody know how to
>>> >fix this problem, or did anybody find a way to bypass it (like
>>> >downloading a list of accession numbers and corresponding pathways)?
>>> >Here is my script:
>>> 
>>> I guess the best thing for you to do is to update your R and BioC packages.
>>> The released version of AnnBuilder is 1.12.0 while you have 1.10.0 on your
>>> machine.
>>> 
>>> 
>>> 
>>> >
>>> >***************************************************************************
>>> >
>>> >library(AnnBuilder)
>>> >#Loading required package: Biobase
>>> >#Loading required package: tools
>>> >#Welcome to Bioconductor
>>> >#         Vignettes contain introductory material.  To view,
>>> >#         simply type: openVignette()
>>> >#         For details on reading vignettes, see
>>> >#         the openVignette help page.
>>> >#Loading required package: annotate
>>> >
>>> >library(GO)
>>> >
>>> >sessionInfo()
>>> >
>>> >#Version 2.3.1 (2006-06-01)
>>> >#i386-pc-linux-gnu
>>> >#
>>> >#attached base packages:
>>> >#[1] "splines"   "tools"     "methods"   "stats"     "graphics"  "grDevices"
>>> >#[7] "utils"     "datasets"  "base"
>>> >#
>>> >#other attached packages:
>>> >#  
>>> >#       globaltest               vsn             limma          multtest
>>> >#          "4.2.0"          "1.10.0"           "2.7.3"          "1.10.2"
>>> >#         survival          affydata              affy            affyio
>>> >#           "2.20"           "1.8.0"          "1.10.0"           "1.0.0"
>>> >#           KEGG            GO        AnnBuilder           RSQLite
>>> >#         "1.12.0"          "1.12.0"          "1.10.0"           "0.4-1"
>>> >#              DBI          annotate               XML           Biobase
>>> >#         "0.1-10"          "1.10.0"          "0.99-7"          "1.10.0"
>>> >       
>>> >
>>> >mySrcUrls <- getSrcUrl("all", organism="Mus Musclusus")
>>> >
>>> >base<- file.path(.path.package("AnnBuilder"), "data", "lgtc.ids.1.txt")
>>> >
>>> >myBaseType<- "gbNRef"
>>> >ABPkgBuilder(baseName=base, 
>>> >                      srcUrls = mySrcUrls,
>>> >                      baseMapType = myBaseType, 
>>> >                      pkgName = "lgtc201106",
>>> >                      pkgPath = ".", 
>>> >                      organism ="Mus Musclusus", 
>>> >                      version ="1.1.0", 
>>> >                      author = list(author = "Paola Pedotti", 
>>> >                      maintener ="Paola Pedotti <p.pedotti at lumc.nl>")
>>> >                      ) 
>>> >                       
>>> >                       
>>> >#Failed to get data from URL:
>>> >ftp://ftp.genome.ad.jp/pub/kegg/pathways//07214.gene
>>> >#Failed to get data from URL:
>>> >ftp://ftp.genome.ad.jp/pub/kegg/pathways//07215.gene
>>> >#Failed to get data from URL:
>>> >ftp://ftp.genome.ad.jp/pub/kegg/pathways//07216.gene
>>> >#Failed to get data from URL:
>>> >ftp://ftp.genome.ad.jp/pub/kegg/pathways//07217.gene
>>> >#Failed to get data from URL:
>>> >ftp://ftp.genome.ad.jp/pub/kegg/pathways//07218.gene
>>> >#[1] "0 2 2"
>>> >#Warning message:
>>> >#cannot open file
>>> >#'/usr/local/lib/R/site-library/AnnBuilder/templates/PKGNAMEGO.1.Rd',
>>> >#reason 'No such file or directory'
>>> >#The following data sets have been added to the database and will be
>>> >#removed:
>>> ># [1] "./lgtc161106/data/lgtc161106ACCNUM.rda"
>>> ># [2] "./lgtc161106/data/lgtc161106CHR.rda"
>>> ># [3] "./lgtc161106/data/lgtc161106ENZYME.rda"
>>> ># [4] "./lgtc161106/data/lgtc161106GENENAME.rda"
>>> ># [5] "./lgtc161106/data/lgtc161106GO.1.rda"
>>> ># [6] "./lgtc161106/data/lgtc161106GO2ALLPROBES.rda"
>>> ># [7] "./lgtc161106/data/lgtc161106GO2PROBE.rda"
>>> ># [8] "./lgtc161106/data/lgtc161106GO.rda"
>>> ># [9] "./lgtc161106/data/lgtc161106LOCUSID.rda"
>>> >#[10] "./lgtc161106/data/lgtc161106MAPCOUNTS.rda"
>>> >#[11] "./lgtc161106/data/lgtc161106MAP.rda"
>>> >#[12] "./lgtc161106/data/lgtc161106OMIM.rda"
>>> >#[13] "./lgtc161106/data/lgtc161106ORGANISM.rda"
>>> >#[14] "./lgtc161106/data/lgtc161106PATH.rda"
>>> >#[15] "./lgtc161106/data/lgtc161106PMID2PROBE.rda"
>>> >#[16] "./lgtc161106/data/lgtc161106PMID.rda"
>>> >#[17] "./lgtc161106/data/lgtc161106QCDATA.rda"
>>> >#[18] "./lgtc161106/data/lgtc161106QC.rda"
>>> >#[19] "./lgtc161106/data/lgtc161106REFSEQ.rda"
>>> >#[20] "./lgtc161106/data/lgtc161106SUMFUNC.rda"
>>> >#[21] "./lgtc161106/data/lgtc161106SYMBOL.rda"
>>> >#[22] "./lgtc161106/data/lgtc161106UNIGENE.rda"
>>> >#Warning message:
>>> >#Can't copy
>>> >#/usr/local/lib/R/site-library/AnnBuilder/templates/PKGNAMEGO.1.Rd
>>> >#in: copyTemplates(repList, pattern, pkgName, pkgPath)
>>> >
>>> >***************************************************************************
>>> >
>>> >
>>> >thank you in advance
>>> >
>>> >Paola
>>> >
>>> >
>>> >
>>> >_______________________________________
>>> >Center for Human and Clinical Genetics
>>> >Leiden University Medical Center      
>>> >Postzone: S-04-P, Postbus 9600        
>>> >2300 RC Leiden, The Netherlands 
>>> >Telephone: +31 71 526 9440 
>>> >Fax: +31 71 526 8285
>>> >
>>> >_______________________________________________
>>> >Bioconductor mailing list
>>> >Bioconductor at stat.math.ethz.ch
>>> >https://stat.ethz.ch/mailman/listinfo/bioconductor
>>> >Search the archives: 
>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>> 
>>> Jianhua Zhang
>>> Department of Medical Oncology
>>> Dana-Farber Cancer Institute
>>> 44 Binney Street
>>> Boston, MA 02115-6084
>>>
>>
>

Jianhua Zhang
Department of Medical Oncology
Dana-Farber Cancer Institute
44 Binney Street
Boston, MA 02115-6084


