[BioC] Create transcriptDb using gff3 files? - library GenomicFeatures and rtracklayer

Sang Chul Choi schoi at cornell.edu
Thu Apr 5 16:57:37 CEST 2012


Speaking of typical gff files, I am wondering whether the gff files for bacterial genomes available at ftp://ftp.ncbi.nlm.nih.gov/genomes/Bacteria

for example,
ftp://ftp.ncbi.nlm.nih.gov/genomes/Bacteria/Streptococcus_mutans_UA159_uid57947/NC_004350.gff

could serve as typical examples.  Eukaryotic annotation is more complex than prokaryotic annotation, but I hope that these bacterial gff files could still count as typical gff input.

Out of curiosity, are there Bioconductor packages for extracting bacterial genome annotation (e.g., genes and their positions) in order to create a TranscriptDb object? I think that creating an AnnotationDb for a little-known bacterial species would be overkill.
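
To make the question concrete, here is a rough, language-neutral sketch (in Python, not R) of the kind of extraction I have in mind: it pulls the "gene" rows out of gff3 text and builds two tables shaped like the transcripts and splicings arguments of makeTranscriptDb. Everything in it is an assumption for illustration: the example lines are made up (not taken from NC_004350.gff), the function names are hypothetical, the column names (tx_id, tx_chrom, exon_rank, ...) are as I recall them from the makeTranscriptDb documentation and should be checked, and treating each bacterial gene as one single-exon transcript is a simplification.

```python
# Hypothetical gff3 snippet (tab-separated, nine columns); made-up data.
EXAMPLE_GFF3 = (
    "##gff-version 3\n"
    "chrA\texample\tgene\t100\t400\t.\t+\t.\tID=gene0;Name=geneA\n"
    "chrA\texample\tCDS\t100\t400\t.\t+\t0\tID=cds0;Parent=gene0\n"
    "chrA\texample\tgene\t900\t1500\t.\t-\t.\tID=gene1;Name=geneB\n"
)


def parse_attributes(field):
    """Split the ninth gff3 column ('key=value;key=value') into a dict."""
    pairs = (item.split("=", 1) for item in field.split(";") if "=" in item)
    return {key.strip(): value.strip() for key, value in pairs}


def gene_tables(gff3_text):
    """Return (transcripts, splicings) as lists of dicts, one per gene row.

    Assumption: each 'gene' feature becomes one transcript with one exon.
    """
    transcripts, splicings = [], []
    for line in gff3_text.splitlines():
        # Skip blank lines, directives, and comments.
        if not line.strip() or line.startswith("#"):
            continue
        columns = line.split("\t")
        if len(columns) != 9:
            continue  # not a well-formed feature line
        seqid, _source, ftype, start, end, _score, strand, _phase, attrs = columns
        if ftype != "gene":
            continue
        tx_id = len(transcripts) + 1  # simple running integer id
        transcripts.append({
            "tx_id": tx_id,
            "tx_name": parse_attributes(attrs).get("Name", ""),
            "tx_chrom": seqid,
            "tx_strand": strand,
            "tx_start": int(start),
            "tx_end": int(end),
        })
        splicings.append({
            "tx_id": tx_id,
            "exon_rank": 1,
            "exon_start": int(start),
            "exon_end": int(end),
        })
    return transcripts, splicings


transcripts, splicings = gene_tables(EXAMPLE_GFF3)
```

In R, the analogous step would presumably be to build two data frames with these columns from the object returned by import.gff3 and pass them to makeTranscriptDb; I have not verified that this works on the NCBI files.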

Thank you,

SangChul

On Apr 4, 2012, at 8:44 PM, Marc Carlson wrote:

> I was looking at this during the course, and this is on my TODO list for the next release cycle.  I think it is long overdue and I don't think that the community is going to get it done in spite of all the enthusiasm.  There has not been time to do it before now but I am hoping that will now change.  It should be simple enough in principle, but it might not be exactly trivial as I have discovered (on closer inspection) that the gff specification is not as concrete as one would like it to be.  Also there have been several different versions.
> 
> Some things that can help speed me along:
> 
> 1) which version is most important?  gff3?  Or one of the other versions?  It is likely that with the older versions we may not be able to extract as much meaningful information.
> 
> 2) where is the best place to find some typical gff3 files for examples?  This should not be difficult, but when I was looking before I was finding that people were surprisingly stingy about sharing these.
> 
> 
>  Marc
> 
> 
> 
> On 04/03/2012 03:57 PM, Michael Lawrence wrote:
>> Marc was working on this during the course in Feb. Not sure what happened
>> to it. He said it was simple. Maybe just waiting for the release to pass.
>> 
>> Michael
>> 
>> On Tue, Apr 3, 2012 at 3:40 PM, Steve Lianoglou <mailinglist.honeypot at gmail.com> wrote:
>> 
>>> Hi,
>>> 
>>> On Tue, Apr 3, 2012 at 4:41 PM, Sang Chul Choi <schoi at cornell.edu> wrote:
>>>> Hi,
>>>> 
>>>> I am wondering if I could create a TranscriptDb object (library
>>>> GenomicFeatures) using a gff3 file.  I could read a gff3 file using
>>>> import.gff3, but I could not find a way to create a TranscriptDb object
>>>> from the object returned by import.gff3.
>>>>
>>>> Two arguments for makeTranscriptDb are required: transcripts, splicings.
>>>> It does not seem to be easy to parse this information from the object
>>>> returned by import.gff3.  I will appreciate any help.
>>> 
>>> As far as I know, this functionality isn't there yet ...
>>> 
>>> I once (early feb, 2012) suggested I might take a crack at making this
>>> happen but haven't actually found the time to do it ... I'm not sure
>>> anyone in bioc-core land (hi, Marc) has found the time to do it
>>> either, so I think you're out of luck.
>>> 
>>> Sorry for that. But the good news is that I bet a patch that does this
>>> would be welcome ;-)
>>> 
>>> -steve
>>> 
>>> --
>>> Steve Lianoglou
>>> Graduate Student: Computational Systems Biology
>>>  | Memorial Sloan-Kettering Cancer Center
>>>  | Weill Medical College of Cornell University
>>> Contact Info: http://cbio.mskcc.org/~lianos/contact
>>> 
>>> _______________________________________________
>>> Bioconductor mailing list
>>> Bioconductor at r-project.org
>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>> Search the archives:
>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>> 


