[BioC] Creating annotation packages

Gábor Csárdi Gabor.Csardi at unil.ch
Thu Mar 5 18:24:03 CET 2009


Marc,

thanks, I finally had time for this project. Things mostly went well,
apart from some minor tweaks that I needed.

createSimpleBimap() is too simple for me: I have more complicated maps,
so I could not use it, but I managed to fall back on (internal)
functions of AnnotationDbi instead. I have two questions.

First, I want to define some new classes that extend 'AnnDbBimap', but

      setClass("miRNAAnnDbBimap", contains="AnnDbBimap")
      setClass("miRNATargetAnnDbBimap", contains="AnnDbBimap")

gives me warnings, because 'AnnObj' is not exported:

Loading required package: DBI
Warning message:
In .findOrCopyClass(class2, classDef2, where, "subclass") :
  Class "AnnObj" is defined (with package slot "AnnotationDbi") but no
metadata object found to revise subclass information---not exported?
Making a copy in package "targetscan.Hs.eg.db"

I don't know much about S4, so maybe I am doing something wrong here.

Second, for one mapping I have lots of metadata, and it is not clear to
me how to define the L2Rchain to get everything right. Right now I am
doing this:

      list(objName="TARGETSFULL",
           Class="miRNATargetAnnDbBimap",
           L2Rchain=list(
             list(tablename="genes",
                  Lcolname="gene_id",
                  Rcolname="_id"
                  ),
             list(tablename="targets",
                  Lcolname="target",
                  Rattribnames=c(
                    UTR_start="{utr_start}",
                    UTR_end="{utr_end}",
                    MSA_start="{msa_start}",
                    MSA_end="{msa_end}",
                    Seed_match="seed_match.name",
                    PCT="{pct}"),
                  Rattrib_join=paste(
                    "LEFT JOIN seed_match ON {seed_match}=seed_match._id",
                    "LEFT JOIN mirna_family AS _R ON {family}=_R._id"),
                  Rcolname="name"
##                   ),
##              list(tablename="mirna_family",
##                   Lcolname="_id",
##                   Rcolname="name"
                  )
             )
           )

which is quite a hack, and as a result I cannot use revmap() on this
mapping. (Maybe there are other deficiencies I have failed to notice so
far.) If I use the lines that are currently commented out (and remove
the second LEFT JOIN and use Rcolname="family", which is the name of
the column), then I lose all the attributes listed in Rattribnames.
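
To make the intent concrete, here is a minimal sketch of the flat query
the mapping is meant to express, run against a toy in-memory SQLite
database. The table and column names are taken from the L2Rchain above;
the rows are invented for illustration, and the join between genes and
targets (genes._id = targets.target) is an assumption inferred from the
chain. This is not the SQL that AnnotationDbi generates internally.

```r
library(DBI)

# Toy in-memory database. Table and column names follow the L2Rchain above;
# the full column list of 'targets' and all rows are made up for illustration.
con <- dbConnect(RSQLite::SQLite(), ":memory:")

dbExecute(con, "CREATE TABLE genes (_id INTEGER PRIMARY KEY, gene_id TEXT)")
dbExecute(con, "CREATE TABLE seed_match (_id INTEGER PRIMARY KEY, name TEXT)")
dbExecute(con, "CREATE TABLE mirna_family (_id INTEGER PRIMARY KEY, name TEXT)")
dbExecute(con, "CREATE TABLE targets (
                  target INTEGER, family INTEGER,
                  utr_start INTEGER, utr_end INTEGER,
                  msa_start INTEGER, msa_end INTEGER,
                  seed_match INTEGER, pct REAL)")

dbExecute(con, "INSERT INTO genes VALUES (1, '1000')")
dbExecute(con, "INSERT INTO seed_match VALUES (1, '8mer')")
dbExecute(con, "INSERT INTO mirna_family VALUES (1, 'let-7')")
dbExecute(con, "INSERT INTO targets VALUES (1, 1, 10, 17, 12, 19, 1, 0.9)")

# Left side: Entrez Gene ID; right side: miRNA family name;
# everything else corresponds to the attributes in Rattribnames.
# Assumption: genes._id joins to targets.target.
res <- dbGetQuery(con, "
  SELECT g.gene_id,
         f.name      AS family,
         t.utr_start AS UTR_start,
         t.utr_end   AS UTR_end,
         t.msa_start AS MSA_start,
         t.msa_end   AS MSA_end,
         s.name      AS Seed_match,
         t.pct       AS PCT
    FROM genes AS g
    JOIN targets AS t ON t.target = g._id
    LEFT JOIN seed_match   AS s ON t.seed_match = s._id
    LEFT JOIN mirna_family AS f ON t.family = f._id")

dbDisconnect(con)
print(res)
```

The question is how to express this as an L2Rchain so that the family
name ends up on the right side, the attributes are kept, and revmap()
still works.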

Maybe I am doing something wrong here; I could not find any
documentation on how to write L2Rchains. Or maybe there is some
createNonSimpleBimap() function that I should use and I just could not
find it.

Apart from these, I am happy with the result. Best Regards,
Gabor

> sessionInfo()
R version 2.8.1 (2008-12-22)
i486-pc-linux-gnu

locale:
LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C

attached base packages:
[1] tools     stats     graphics  grDevices utils     datasets  methods
[8] base

other attached packages:
[1] targetscan.Hs.eg.db_5.0-1 RSQLite_0.7-1
[3] DBI_0.2-4                 AnnotationDbi_1.5.18
[5] Biobase_2.2.1
>

On Fri, Jan 30, 2009 at 7:24 PM, Marc Carlson <mcarlson at fhcrc.org> wrote:
> Hi Gabor,
>
> I would recommend that you make a package using the SQLForge vignette as
> Sean suggested, then use DBI to add your mappings into it as "stand
> alone" tables so that you can use the createSimpleBimap() to easily make
> AnnDbBimap objects when your package loads.   Please let me know if you
> need further assistance.
>
>
>  Marc
>
>
>
>
> Gábor Csárdi wrote:
>> Sean, looks great, thanks, G.
>>
>> On Fri, Jan 30, 2009 at 5:22 PM, Sean Davis <sdavis2 at mail.nih.gov> wrote:
>>
>>> On Fri, Jan 30, 2009 at 11:13 AM, Gábor Csárdi <Gabor.Csardi at unil.ch> wrote:
>>>
>>>> Dear All,
>>>>
>>>> I am trying to create an annotation package that contains predicted
>>>> miRNA targets, basically it would be a mapping between Entrez Gene IDs
>>>> and miRNA families, for a couple of organisms, together with some
>>>> additional info of course.
>>>>
>>>> I am trying to make use of the AnnBuilder package, but could not find
>>>> out whether it can do this at all or not. Btw. its vignettes
>>>> seem to be a bit outdated, e.g. the 'writeAnnData2Pkg' function is not
>>>> public any more.
>>>>
>>>> So my questions are:
>>>>
>>>> 1) Is AnnBuilder the right tool for this?
>>>>
>>>> 2) If not, are there any "right" tools? I don't mind creating the
>>>> sqlite database by hand, but how do I create AnnDbBimap objects for
>>>> it?
>>>>
>>>> 3) Are your scripts for building the standard annotation packages
>>>> (e.g. org.xx.eg.db) publicly available somewhere? It would be of great
>>>> help to see how this is done.
>>>>
>>> Hi, Gabor.  Actually, the AnnBuilder way of building things is deprecated.
>>> You will want to use the AnnotationDbi package and refer to the SQLForge
>>> vignette.
>>>
>>> Hope that helps.  If you have more specific questions, be sure to include
>>> sessionInfo() output.
>>>
>>> Sean
>>>
>>>
>>>
>>>
>>
>>
>>
>>
>
>



-- 
Gabor Csardi <Gabor.Csardi at unil.ch>     UNIL DGM