[BioC] annotation package and namespace

Thu Feb 14 01:08:25 CET 2008

Hi Georg --

I think there's a 'more correct' way and an 'easy' way (and probably
other ways). I'm mostly, but not 100%, sure of the following.

An easy way is to add a function .onLoad to your package (see
?.onLoad). It is called when the package is loaded (library(mypkg)),
but before the namespace is 'sealed'. In .onLoad you would write code
to load the contents of your environment into the namespace (which, at
the time .onLoad is called, is topenv(parent.frame())). This runs
every time the package is loaded and reads the entire contents of your
environment into the namespace, so it will be slow if the environment
is large.
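
As a rough sketch of that idea (not tested inside a real package): the
helper loadIntoNamespace, the file name myENV.rda and its location under
inst/extdata are all assumptions made for illustration. The second half
just exercises the helper against an ordinary environment standing in
for the namespace.

```r
## Hypothetical helper: copy every object stored in an .rda file into 'ns'.
loadIntoNamespace <- function(rdaFile, ns) {
    load(rdaFile, envir = ns)
}

## Sketch of the hook as it might appear in the package (e.g. R/zzz.R).
## Assumes the file was shipped as inst/extdata/myENV.rda.
.onLoad <- function(libname, pkgname) {
    rdaFile <- system.file("extdata", "myENV.rda",
                           package = pkgname, lib.loc = libname)
    ## Called from inside a package function, topenv() is the namespace.
    loadIntoNamespace(rdaFile, topenv())
}

## Stand-alone demonstration with a stand-in for the namespace.
myENV <- new.env()
assign("probe_1", "geneA", envir = myENV)
rdaFile <- tempfile(fileext = ".rda")
save(myENV, file = rdaFile)
rm(myENV)

fakeNS <- new.env()
loadIntoNamespace(rdaFile, fakeNS)
mget("probe_1", envir = get("myENV", envir = fakeNS))
```

For the symbol to show up in ls("package:myPackage") it would presumably
also need to be exported in the NAMESPACE file.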

A 'more correct' way is to create a 'lazy load' database when the
package is created. This pre-manufactures the namespace, with the
symbols from your environment available but not yet loaded. This is
what AnnBuilder does; you've probably noticed that annotation
packages load quickly, but that there is a pause the first time one of
the environments is referenced. That is 'lazy loading': the
symbols are loaded with the package, but the data they refer to
are not loaded until first referenced. The best place to
understand how to do this is probably to look at AnnBuilder:::makeLLDB
(and ABPkgBuilder).
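
To illustrate only the idea of 'symbol now, data later' (this is not
what AnnBuilder does internally, and the names bigTable, readFromDisk
and state are made up for the example), base R's delayedAssign can bind
a symbol to a promise that reads the data from disk on first use:

```r
## Write some stand-in annotation data to disk, then forget it.
rdaFile <- tempfile(fileext = ".rda")
bigTable <- c(probe_1 = "geneA", probe_2 = "geneB")
save(bigTable, file = rdaFile)
rm(bigTable)

## Flag recording whether the file has been read yet.
state <- new.env()
state$loaded <- FALSE

## Hypothetical reader: only runs when the promise is forced.
readFromDisk <- function() {
    state$loaded <- TRUE
    tmp <- new.env()
    load(rdaFile, envir = tmp)
    get("bigTable", envir = tmp)
}

## 'ns' stands in for the package namespace.
ns <- new.env()
delayedAssign("bigTable", readFromDisk(), assign.env = ns)

symbols <- ls(ns)                    # the symbol is already listed
before  <- state$loaded              # nothing has been read yet
value   <- get("bigTable", envir = ns)[["probe_1"]]  # first reference
after   <- state$loaded              # the file has now been read
```

A real lazy-load database stores the objects in a different on-disk
format, but the visible behaviour (fast load, pause on first access) is
the same in spirit.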

Martin

Georg Otto <georg.otto at tuebingen.mpg.de> writes:

> Dear Bioconductors,
>
> I am struggling with the creation of an annotation package and the
> namespace involved. I apologize if this is a trivial question, but I
> am not very familiar with namespace issues.
>
> I created environments containing annotation data using functions I
> wrote for that purpose (for some reasons I do not want to use
> AnnBuilder). This did work, since I can retrieve these data using
>
>> mget(probeids, env=myENV)
>
> As a next step I save the environment as an .rda file, create a
> package and move the data into myPackage/data.
>
> My problem now is: how to get the environment in the package namespace
> upon loading of the package.
>
> For example if I load the annotation package from bioconductor
> (zebrafish) and list the package namespace I get:
>
>> library(zebrafish)
>> ls("package:zebrafish")
>
>  
>  [1] "zebrafish"             "zebrafishACCNUM"       "zebrafishCHR"         
>  [4] "zebrafishENTREZID"     "zebrafishENZYME"       "zebrafishENZYME2PROBE"
>  [7] "zebrafishGENENAME"     "zebrafishGO"           "zebrafishGO2ALLPROBES"
> [10] "zebrafishGO2PROBE"     "zebrafishLOCUSID"      "zebrafishMAP"         
> [13] "zebrafishMAPCOUNTS"    "zebrafishORGANISM"     "zebrafishPATH"        
> [16] "zebrafishPATH2PROBE"   "zebrafishPMID"         "zebrafishPMID2PROBE"  
> [19] "zebrafishQC"           "zebrafishQCDATA"       "zebrafishREFSEQ"      
> [22] "zebrafishSUMFUNC"      "zebrafishSYMBOL"       "zebrafishUNIGENE"     
>
> If I do the same thing with my own package, I get
>
>> library(myPackage)
>> ls("package:myPackage")
> character(0)
>
> What I can do is to load the environment, so it appears in my
> workspace, but that is not what I want:
>
>> data(myENV)
>> ls("package:zebrafishBM")
> character(0)
>> ls()
> [1] "myENV"
>
>
> My question is: How should I construct the package, so the annotation
> environment appears in the package workspace, equivalent to the way it
> works in the bioconductor package.
>
> This is R version 2.6.1 (2007-11-26), x86_64-redhat-linux-gnu
>
> Thanks a lot!
>
> Georg

-- 
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M2 B169
Phone: (206) 667-2793