[BioC] advantages of annotation packages

Martin Morgan mtmorgan at fhcrc.org
Tue May 14 19:47:10 CEST 2013


On 05/14/2013 10:42 AM, Cook, Malcolm wrote:
> This is potentially a very important point.
>
> However, the lack of an easy way to install previous versions of Bioc packages works against it.

I'm not understanding your comment. Bioc versions are released with specific R 
versions. Install the appropriate R version, and get the corresponding Bioc 
packages via biocLite(). Challenges occur when trying to install old R on new 
hardware (e.g., because the old R doesn't compile with new gcc or new 
libraries), but that's probably not what you mean?
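
A minimal sketch of that workflow, assuming the biocLite.R installer script
in use at the time of this thread and network access to bioconductor.org.
The package org.Hs.eg.db is only an example, and install_annotation() is a
hypothetical helper name, not a Bioconductor function:

```r
# Sketch only: run this inside the R version paired with the Bioc release you want.
# install_annotation() is a hypothetical helper, not part of Bioconductor.
install_annotation <- function(pkg = "org.Hs.eg.db") {
  # The installer script defines biocLite() (needs network access)
  source("http://bioconductor.org/biocLite.R")
  # biocLite() installs the package version belonging to this R's Bioc release
  biocLite(pkg)
  # Return the installed version so it can be recorded
  packageVersion(pkg)
}

# install_annotation("org.Hs.eg.db")  # uncomment to install
# sessionInfo()                       # lists R and attached package versions
```

Recording the output of sessionInfo() alongside the results keeps the R,
Bioc and annotation package versions traceable later.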

Martin

>
> ~ malcolm_cook at stowers.org
>
> ________________________________________
> From: bioconductor-bounces at r-project.org [bioconductor-bounces at r-project.org] on behalf of Marc Carlson [mcarlson at fhcrc.org]
> Sent: Monday, May 13, 2013 3:45 PM
> To: bioconductor at r-project.org
> Subject: Re: [BioC] advantages of annotation packages
>
> Just adding to what Martin already said, it's mostly about making your
> research more easily reproducible by using a consistent and traceable
> source for your information.  This sort of thing is important for doing
> science, where other people will need to reproduce your results exactly.
> If all you had were your own personal data.frame, nobody else could
> really work with it unless you also made it available online.  And even
> assuming you could serve it up somewhere in perpetuity, you would also
> have to explain exactly how you made it.  In short, when you came to
> write the methods section for your findings, you would end up making and
> maintaining your own annotation resource and thus reinventing the wheel.
>
> There are other advantages too.  For example, many different kinds of
> annotation data are packaged together, so you know which version of GO
> a large group of people was using and also which Entrez Gene IDs were
> considered valid at the time.  Things are therefore more standardized
> for a given version of Bioconductor, which helps collaborations (since
> people are basically all working off the same data set).
>
>
>     Marc
>
>
>
> On 05/10/2013 07:03 PM, Martin Morgan wrote:
>> On 05/10/2013 01:17 AM, Rameswara Sashi Kiran Challa wrote:
>>> Hi All,
>>>
>>> Could anyone please explain the advantages of having an annotation
>>> package for an organism, or point me to any documentation that clearly
>>> lists the reasoning behind creating an annotation package?
>>>
>>> Would a data frame in R (with rows as genes and columns as various
>>> types of annotation such as GO, KEGG, UniGene, etc.) not suffice?
>>> What are
>>
>> One aspect not mentioned is that one gets to exploit R's packaging
>> system to provide easily distributed and documented versions of the
>> data. Suppose you created the package eight months ago and have
>> forgotten some of the details. Easy, check out the package
>> description and help page. Say you're working with a couple of
>> colleagues, and you've been relatively disciplined about incrementing
>> the annotation package version when your data changes (or are using a
>> public Bioc annotation package, with versions strictly tied to R / Bioc
>> releases). You can then easily spot when unusual results are due to
>> differences in data version (hence the frequent request for the output
>> of 'sessionInfo()' on this mailing list) and adopt / instill 'best
>> practices' that make sure everyone on the team (including yourself,
>> even if your team is only 1) is using the same version.
>>
>> Martin
>>
>>> the advantages of having AnnDbBimap objects and building a
>>> package? Are there any technical benefits, such as faster access to
>>> information?
>>>
>>> Thanks for your time,
>>>
>>> -Sashi
>>>
>>> _______________________________________________
>>> Bioconductor mailing list
>>> Bioconductor at r-project.org
>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>> Search the archives:
>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>>
>>
>>
>


-- 
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793


