[BioC] Working with non-type strain annotation

Thu May 23 23:48:40 CEST 2013

Hi Thomas,

Have you looked at the makeOrgPackageFromNCBI() function in the 
AnnotationForge package?

library(AnnotationForge)
?makeOrgPackageFromNCBI

It can be useful when you are working with a less common organism.
However, in your case it might not work, since there is a chance that
even NCBI does not have annotations available for your strains.  If
that is the case, then you would have to do some more custom work
(depending on what information you actually have).
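
As a rough sketch of what such custom work could look like without
building an annotation package: the base-R example below keeps the
annotation as a plain named list (GO term -> CDS ids) and runs a
one-sided hypergeometric over-representation test per term.  The CDS
ids, GO assignments, and hit list are all invented for illustration,
and enrich() is a hypothetical helper, not part of any package.  A
Blast2GO export would first have to be read into a two-column table
of the same shape.

```r
# Hypothetical Blast2GO-style table: one row per CDS / GO term pair.
# All identifiers are made up for illustration.
cds2go <- data.frame(
  cds = c("cds1", "cds1", "cds2", "cds3", "cds4", "cds5", "cds6"),
  go  = c("GO:0006096", "GO:0008152", "GO:0006096", "GO:0006096",
          "GO:0008152", "GO:0006412", "GO:0006412"),
  stringsAsFactors = FALSE
)

# Ad hoc annotation structure: named list, GO term -> annotated CDS ids.
go2cds <- split(cds2go$cds, cds2go$go)

universe <- unique(cds2go$cds)         # every CDS with at least one GO term
hits     <- c("cds1", "cds2", "cds3")  # hypothetical proteins of interest
hits     <- intersect(hits, universe)  # keep only annotated hits

# Hypothetical helper: P(X >= k) under the hypergeometric distribution,
# where k is the number of hits annotated with the term.
enrich <- function(members, hits, universe) {
  members <- intersect(members, universe)
  k <- length(intersect(members, hits))  # hits carrying the term
  m <- length(members)                   # CDSs carrying the term
  n <- length(universe) - m              # CDSs not carrying the term
  phyper(k - 1, m, n, length(hits), lower.tail = FALSE)
}

pvals <- sapply(go2cds, enrich, hits = hits, universe = universe)
padj  <- p.adjust(pvals, method = "BH")  # Benjamini-Hochberg correction
print(sort(padj))
```

The same kind of gene-to-GO mapping can also be handed to packages
that accept custom annotations (topGO, for example, takes one via
annFUN.gene2GO), so a full annotation package is not necessarily
needed for enrichment analysis.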

   Marc

On 05/16/2013 01:30 AM, Thomas Dybdal Pedersen wrote:
> Hi
>
> I'm doing proteomics on industrial bacterial strains. The genomes of these strains are almost complete (the contigs have not been joined), so my main genomic data is a list of CDSs. I have functionally annotated these using Blast2GO, and thus have GO terms, and possibly an EC number and the UniProt ID of the closest match, for most of the CDSs.
>
> My question is thus: how do I best proceed with this data in the Bioconductor framework when I want to do things such as gene set enrichment analysis? Is the best approach to build my own annotation package for each strain, or is there a simpler 'ad hoc' data structure that supports the same functionality?
>
> It seems that most of the tutorials etc. assume that you work on type strains (which is also probably true for the most part), where an annotation package is readily available…
>
> best
>
> Thomas