[BioC] Building AnnotationDbi Packages - Problems when using makeAnnDbPkg

Marc Carlson mcarlson at fhcrc.org
Tue Jun 1 22:45:23 CEST 2010


Hi Julian,

The first thing to note is that you cannot actually produce a version 1.0
schema at this time; you can only make a version 2.1 schema.  The older
schemas are unsupported, and their files remain only for people who might
encounter an older database.
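
If you want to see which schema version a database claims to be, something
like the following should work.  It is only a sketch: it builds a throwaway
in-memory database rather than touching your file, and it assumes a metadata
table with name/value columns, the key spelled DBSCHEMAVERSION, and a DBI
that provides dbExecute().

```r
library(DBI)
library(RSQLite)

# Throwaway in-memory database standing in for myCustomChip.sqlite.
con <- dbConnect(SQLite(), ":memory:")

# Assumed layout of the metadata table: one name/value pair per row.
dbExecute(con, "CREATE TABLE metadata (name TEXT PRIMARY KEY, value TEXT)")
dbExecute(con, "INSERT INTO metadata VALUES ('DBSCHEMA', 'HUMANCHIP_DB')")
dbExecute(con, "INSERT INTO metadata VALUES ('DBSCHEMAVERSION', '1.0')")

# Read back the schema version the database reports.
schema_version <- dbGetQuery(
  con, "SELECT value FROM metadata WHERE name = 'DBSCHEMAVERSION'"
)$value

# Only 2.1 schemas can currently be built.
is_supported <- identical(schema_version, "2.1")

dbDisconnect(con)
```

With the values from your message, is_supported comes out FALSE.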

To answer your question about ACCNUM: this is a mapping that is normally
used to store the ID that was originally used to tie the probe to the
Entrez Gene ID.  As a table it is frequently empty, but the table is
still expected to be there, along with its entry in map_metadata.
@ACCNUMSOURCE@ is just a template placeholder.  It gets filled in with
the data from the map_metadata table about the expected ACCNUM mapping
(even if the ACCNUM table is empty), and the result goes into the
templates for the manual pages that get generated.
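
As a rough way to see which maps have no map_metadata row to template from,
you could try something like this.  It is a sketch, not what makeAnnDbPkg()
does internally: the four-column table layout, the in-memory database, and
the short list of map names in `wanted` are all assumptions for illustration,
and it needs a DBI that provides dbExecute().

```r
library(DBI)
library(RSQLite)

# Throwaway in-memory database standing in for myCustomChip.sqlite.
con <- dbConnect(SQLite(), ":memory:")

# Assumed column layout of map_metadata (four text columns, matching the
# four values per row in the quoted INSERT statements).
dbExecute(con, "CREATE TABLE map_metadata (
  map_name TEXT NOT NULL, source_name TEXT NOT NULL,
  source_url TEXT NOT NULL, source_date TEXT NOT NULL)")
dbExecute(con, "INSERT INTO map_metadata VALUES
  ('ALIAS', 'XXXX', 'TST612', '26 MAY 2010')")

# Illustrative list of maps whose source information is wanted.
wanted <- c("ALIAS", "ACCNUM")

# Maps that actually have a row in map_metadata.
present <- dbGetQuery(
  con, "SELECT DISTINCT map_name FROM map_metadata"
)$map_name

# Maps with no row, i.e. nothing to fill into a template.
missing_maps <- setdiff(wanted, present)
print(missing_maps)

dbDisconnect(con)
```

Any name that shows up in missing_maps is a candidate for the kind of NA
substitution error you are seeing.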

So the problem that you are having comes from using makeAnnDbPkg() on a
database that was not built with the companion function populateDB().
makeAnnDbPkg() expects certain things from your database that are simply
not present given the way you produced it.  If you want to hack our
system to build a custom package anyway, you might have better luck
doing so after the package has been built, by removing manual pages and
dropping tables that you no longer need.  Or, if you have other data
that you want to add to the database (from alternate sources etc.), then
maybe you want to put some placeholder values into the map_metadata
table so that the manual pages can be auto-generated, and then drop
those manual pages afterwards?
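
The placeholder idea might look roughly like this.  Again this is an
untested sketch against an in-memory stand-in for your database:
add_placeholder() is a made-up helper, the map_metadata column layout is
assumed, the "placeholder" strings are arbitrary, and I have not confirmed
that ACCNUM is the only row your build would need.

```r
library(DBI)
library(RSQLite)

# Throwaway in-memory database standing in for myCustomChip.sqlite.
con <- dbConnect(SQLite(), ":memory:")

# Assumed column layout of map_metadata.
dbExecute(con, "CREATE TABLE map_metadata (
  map_name TEXT NOT NULL, source_name TEXT NOT NULL,
  source_url TEXT NOT NULL, source_date TEXT NOT NULL)")

# Hypothetical helper: add a placeholder row for a map only if the
# table has no row for that map yet.
add_placeholder <- function(con, map_name) {
  n <- dbGetQuery(
    con,
    "SELECT COUNT(*) AS n FROM map_metadata WHERE map_name = ?",
    params = list(map_name)
  )$n
  if (n == 0) {
    dbExecute(
      con,
      "INSERT INTO map_metadata VALUES (?, ?, ?, ?)",
      params = list(map_name, "placeholder", "placeholder", "placeholder")
    )
  }
  invisible(n == 0)
}

add_placeholder(con, "ACCNUM")
add_placeholder(con, "ACCNUM")  # second call finds the row and does nothing

rows <- dbGetQuery(con, "SELECT map_name, source_name FROM map_metadata")

dbDisconnect(con)
```

Checking for an existing row first keeps the helper safe to re-run while
you iterate on the package build.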

Please let me know if this helps.  I can't test-run your specific
examples because I don't have your actual data. :(


  Marc




On 05/26/2010 09:12 PM, Julian Lee wrote:
> hi all,
>
> I'm building a custom annotation service and have encountered the following
> problem.
>
> After reading the SQLForge.pdf vignette, I decided not to use these
> functions
>
> popHUMANCHIPDB
> populateDB
>
> but instead build my own SQLite Database following the Database Schema
> Version 1.0 described in
>
> AnnotationDbi/inst/DBschemas/schemas_1.0/HUMANCHIP_DB.sql
>
> The SQLite Database was initiated by the following R code
>
> library(AnnotationDbi)
> library(RSQLite)
> filename<-'myCustomChip.sqlite'
>
> drv<-dbDriver('SQLite')
> db<-dbConnect(drv,dbname=filename)
>
>
> ##Create SQLite Database
> create.sql <- readLines('HUMANCHIP_DB.sql')
> create.sql <- paste(collapse="\n", create.sql)
> create.sql <- strsplit(create.sql, ";")[[1]]
> create.sql <- create.sql[-length(create.sql)]
> database <- sapply(create.sql, function(x) sqliteQuickSQL(db, x))
>
> Subsequently, the following tables were populated according to the rules
> stated
>
> Genes
> Probes
> Alias
> Ensembl
> Chromosomes
> Chromosome_Locations
> Cytogenetic_Locations
> Gene_Info
> Refseq
> Unigene
>
> The metadata tables were populated as follows. Other tables (go, ec, etc.)
> were left empty
>
> metadata<-rbind(c("DBSCHEMA", "HUMANCHIP_DB"),
>                 c("ORGANISM", "Homo sapiens"),
>                 c("SPECIES",  "Human"),
>                 c("DBSCHEMAVERSION", "1.0"),
>                 c("MANUFACTURER","AFFYMETRIX"),
>                 c("CHIPNAME","HG-U133_Plus_2"),
>                 c("MANUFACTURERURL","http://www.affymetrix.com"))
>
>
> q11<- paste(sep="", 'INSERT INTO "metadata" VALUES("', metadata[,1],
>             '", "', metadata[,2],  '");')
>
> database<- sapply(q11, function(x) sqliteQuickSQL(db,x))
>
> map.counts<-rbind(c("GENES", nrow(idtable)),
>                   c("PROBES", nrow(finalprobes)),
>                   c("ALIAS",  nrow(alias)),
>                   c("ENSEMBL", nrow(ensembl)),
>                   c("CHROMOSOMES", nrow(chromosomes)),
>                   c("CHROMOSOME_LOCATIONS", nrow(chromosome_location)),
>                   c("CYTOGENETIC_LOCATIONS", nrow(cytoband)),
>                   c("GENE_INFO", nrow(gene_info)),
>                   c("REFSEQ", nrow(refseq)),
>                   c("UNIGENE", nrow(unigene)))
>
> q12<- paste(sep="", 'INSERT INTO "map_counts" VALUES("', map.counts[,1],
>             '",' , map.counts[,2],  ');')
>
> database<- sapply(q12, function(x) sqliteQuickSQL(db,x))
>
> map.metadata<-rbind(c("GENES", "XXXX", "TST612", "26 MAY 2010"),
>                   c("PROBES", "XXXX", "TST612", "26 MAY 2010"),
>                   c("ALIAS",  "XXXX", "TST612", "26 MAY 2010"),
>                   c("ENSEMBL", "XXXX", "TST612", "26 MAY 2010"),
>                   c("CHROMOSOMES", "XXXX", "TST612", "26 MAY 2010"),
>                   c("CHROMOSOME_LOCATIONS", "XXXX", "TST612", "26 MAY 2010"),
>                   c("CYTOGENETIC_LOCATIONS", "XXXX", "TST612", "26 MAY 2010"),
>                   c("GENE_INFO", "XXXX", "TST612", "26 MAY 2010"),
>                   c("REFSEQ", "XXXX", "TST612", "26 MAY 2010"),
>                   c("UNIGENE", "XXXX", "TST612", "26 MAY 2010"))
>
> q13<- paste(sep="", 'INSERT INTO "map_metadata" VALUES("', map.metadata[,1],
>             '", "', map.metadata[,2], '","', map.metadata[,3],  '","',
>             map.metadata[,4], '");')
>
> database <- sapply(q13, function(x) sqliteQuickSQL(db,x))
>
> After populating the SQLite database, I proceeded to build my custom
> AnnotationDbi .db package.
>
> seed <- new("AnnDbPkgSeed",
>             Package = "myCustomChip.db",
>             Version = "1.0-0",
>             PkgTemplate = "HUMANCHIP.DB",
>             AnnObjPrefix = "myCustomChip",
>             Title = "MyCustom Annotation for HG-U133PLUS2",
>             Author = "Annotation Services",
>             Maintainer = "Julian Lee <julianlhe at gmail.com>",
>             organism = "Homo sapiens",
>             species = "Human",
>             biocViews = "AnnotationData, FunctionalAnnotation",
>             DBschema = "HUMANCHIP_DB"
>             )
>
>
> system("rm -rf myCustomChip.db")
>
>
> makeAnnDbPkg(seed, filename, dest.dir=getwd(),no.man=FALSE)
>
> and got the following error:
>
> Error in cpSubsCon(src[k], destname) :
>  trying to replace @ACCNUMSOURCE@ by an NA
>
>
> I'm trying to figure out where @ACCNUMSOURCE@ is used, and I can't quite
> find it. There's some mention of it in the man *.Rd pages, but I can't
> trace it any further.
> Any help in getting to the bottom of this would be much appreciated.
>
> Thank you very much
>
> Julian Lee
>
> > sessionInfo()
> R version 2.10.1 (2009-12-14)
> x86_64-unknown-linux-gnu
>
> locale:
>  [1] LC_CTYPE=en_US       LC_NUMERIC=C         LC_TIME=en_US
>  [4] LC_COLLATE=en_US     LC_MONETARY=C        LC_MESSAGES=en_US
>  [7] LC_PAPER=en_US       LC_NAME=C            LC_ADDRESS=C
> [10] LC_TELEPHONE=C       LC_MEASUREMENT=en_US LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> other attached packages:
> [1] RSQLite_0.8-4       DBI_0.2-5           AnnotationDbi_1.8.2
> [4] Biobase_2.6.1
>
> loaded via a namespace (and not attached):
> [1] tools_2.10.1
>
> *Acknowledgements for some scripts to get me started*
> Computational Biology Group
> Department of Medical Genetics <http://www.unil.ch/dgm>, University of
> Lausanne <http://www.unil.ch/>
>
> http://www2.unil.ch/cbg/index.php?title=Building_BioConductor_Annotation_Packages
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
>
>


