[BioC] advantages of annotation packages

Marc Carlson mcarlson at fhcrc.org
Mon May 13 22:45:45 CEST 2013


Just adding to what Martin already said, it's mostly about making your 
research more easily reproducible by using a consistent and traceable 
source for your information.  This matters in science, where other 
people need to be able to reproduce your results exactly.  If all you 
have is your own personal data.frame, nobody else can really work with 
it unless you also make it available online.  And even assuming you can 
serve it up somewhere in perpetuity, you also have to explain exactly 
how you made it.  In short, by the time you come to write the methods 
section for your findings, you will have ended up making and maintaining 
your own annotation resource, and thus reinventing the wheel.

There are other advantages too.  For example, many different kinds of 
annotation data are packaged together, so you know which version of GO 
a large group of people was using and also which Entrez Gene IDs were 
considered valid.  Things are therefore more standardized for a given 
version of Bioconductor, which aids collaboration (since people are 
basically all working off the same data set).
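
As a rough sketch of what that buys you in practice, the snippet below 
records which version of a package is installed, so collaborators can 
compare notes.  It uses only base R.  The helper name pkg_version and 
the package name "org.Hs.eg.db" are illustrative assumptions on my part 
(substitute whichever annotation package you actually use); "stats" is 
included only because it ships with every R installation.

```r
# Return the installed version of a package as a string,
# or NA if the package is not installed (nothing gets loaded).
# pkg_version is a hypothetical helper name, not part of any package.
pkg_version <- function(pkg) {
  if (nzchar(system.file(package = pkg))) {
    as.character(utils::packageVersion(pkg))
  } else {
    NA_character_
  }
}

# "org.Hs.eg.db" is only an example annotation package name.
pkgs <- c("stats", "org.Hs.eg.db")
versions <- vapply(pkgs, pkg_version, character(1))
print(versions)

# The fuller record of R and package versions for the current session.
print(sessionInfo())
```

If two people get different versions back for the same annotation 
package, that is the first thing to rule out when their results 
disagree.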


   Marc



On 05/10/2013 07:03 PM, Martin Morgan wrote:
> On 05/10/2013 01:17 AM, Rameswara Sashi Kiran Challa wrote:
>> Hi All,
>>
>> Could anyone please elucidate advantages of having an Annotation package
>> for an organism or point me to any documentation that clearly lists 
>> all the
>> various thoughts behind coming up with an Annotation package.
>>
>> Will not having a data frame in R (with rows as genes and columns as
>> various types of annotations like GO, KEGG, Unigene, etc) suffice? 
>> What are
>
> One aspect not mentioned is that one gets to exploit R's packaging 
> system to provide easily distributed and documented versions of the 
> data. Suppose you created the package eight months ago and have 
> forgotten some of the details. Easy: check out the package 
> description and help page. Say you're working with a couple of 
> colleagues, and you've been relatively disciplined about incrementing 
> the annotation package when your data changes (or are using a public 
> Bioc annotation package, with versions strictly tied to R / Bioc 
> releases). You can then easily spot when unusual results are due to 
> differences in data version (hence the frequent request for the 
> output of 'sessionInfo()' on this mailing list) and adopt / instill 
> 'best practices' that make sure everyone on the team (including 
> yourself, even if your team is only 1) is using the same version.
>
> Martin
>
>> the advantages of having AnnDbBimap objects and building a 
>> package? Are
>> there any technical benefits like faster access of information?
>>
>> Thanks for your time,
>>
>> -Sashi
>>
>
>


