[BioC] extending GenomicFeatures to makeTranscriptDbFrom other sources (i.e. GFF3)

Cook, Malcolm MEC at stowers.org
Tue Feb 22 19:12:12 CET 2011


Marc,

Thanks for the complete reply.

Since my original post I've explored the code a bit and see that the general approach should not be too hard.

I will probably try to contribute makeTranscriptDbFromGFF3 (which I happened to notice mentioned in the project's TODO file ;).

I have been using the existing UCSC adaptor to excellent effect, though with a slightly different version of the fly genome annotation. I will write makeTranscriptDbFromGFF3 once the rest of my analysis is fully coded, at which point I will want to make sure I am running against the identical annotation set used by another, convergent analysis.
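
As a rough illustration of the route Marc describes below (read the GFF3 into a data.frame, then chop it into the pieces makeTranscriptDb() wants), here is a minimal base-R sketch. The toy records, the get_attr() helper and the choice of mRNA/exon feature types are my own assumptions for illustration and are not part of GenomicFeatures. The column names follow my reading of ?makeTranscriptDb and should be checked there. CDS columns and features with more than one Parent are left out.

```r
# Toy GFF3 records (made up for illustration; not real FlyBase data).
gff_lines <- c(
  "2L\ttoy\tgene\t100\t900\t.\t+\t.\tID=g1",
  "2L\ttoy\tmRNA\t100\t900\t.\t+\t.\tID=t1;Parent=g1",
  "2L\ttoy\texon\t100\t300\t.\t+\t.\tID=e1;Parent=t1",
  "2L\ttoy\texon\t600\t900\t.\t+\t.\tID=e2;Parent=t1"
)

# Read the nine standard GFF3 columns; for a real file use file = "..." instead.
gff <- read.delim(
  text = gff_lines, header = FALSE, quote = "", comment.char = "#",
  col.names = c("seqid", "source", "type", "start", "end",
                "score", "strand", "phase", "attributes"),
  colClasses = c("character", "character", "character", "integer", "integer",
                 "character", "character", "character", "character")
)

# Hypothetical helper: pull one key out of the key=value;key=value column.
# Returns NA where the key is absent.
get_attr <- function(attributes, key) {
  pattern <- paste0("^(?:.*;)?", key, "=([^;]*).*$")
  ifelse(grepl(pattern, attributes, perl = TRUE),
         sub(pattern, "\\1", attributes, perl = TRUE),
         NA_character_)
}

# One row per transcript.
mrna <- gff[gff$type == "mRNA", ]
transcripts <- data.frame(
  tx_id     = seq_len(nrow(mrna)),
  tx_name   = get_attr(mrna$attributes, "ID"),
  tx_chrom  = mrna$seqid,
  tx_strand = mrna$strand,
  tx_start  = mrna$start,
  tx_end    = mrna$end,
  stringsAsFactors = FALSE
)

# One row per exon, linked to its transcript through the Parent attribute
# (assumes a single Parent per exon).
exon <- gff[gff$type == "exon", ]
exon$tx_id <- transcripts$tx_id[
  match(get_attr(exon$attributes, "Parent"), transcripts$tx_name)
]
# Rank exons 5' to 3': ascending start on "+", descending start on "-".
exon <- exon[order(exon$tx_id,
                   ifelse(exon$strand == "-", -exon$start, exon$start)), ]
splicings <- data.frame(
  tx_id      = exon$tx_id,
  exon_rank  = ave(exon$tx_id, exon$tx_id, FUN = seq_along),
  exon_start = exon$start,
  exon_end   = exon$end
)

# Transcript-to-gene mapping, again via Parent.
genes <- data.frame(
  tx_id   = transcripts$tx_id,
  gene_id = get_attr(mrna$attributes, "Parent"),
  stringsAsFactors = FALSE
)

# With GenomicFeatures loaded, the tables could then be handed over, e.g.
#   txdb <- makeTranscriptDb(transcripts, splicings, genes = genes)
# (untested here; see ?makeTranscriptDb for the exact requirements).
```

A real FlyBase GFF3 has many more feature types and attributes, so the filtering on type would need adjusting to whatever the file actually contains.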

Cheers,


Malcolm Cook
Stowers Institute for Medical Research -  Bioinformatics
Kansas City, Missouri  USA
 
 

> -----Original Message-----
> From: bioconductor-bounces at r-project.org 
> [mailto:bioconductor-bounces at r-project.org] On Behalf Of Marc Carlson
> Sent: Tuesday, February 22, 2011 12:04 PM
> To: bioconductor at r-project.org
> Subject: Re: [BioC] extending GenomicFeatures to 
> makeTranscriptDbFrom other sources (i.e. GFF3)
> 
> Hi Malcolm,
> 
> We don't yet support GFF in this way.  But you can always use 
> the very general makeTranscriptDb() function.  It takes a lot 
> more arguments, and their values (specifically labeled 
> data.frames) may have to be prepared a bit more, but it 
> should build a database for you just as the helper functions 
> for biomaRt and UCSC will do. 
> 
> So your options appear to be that you could either read in 
> the gff file as a data.frame and then chop it into the bits 
> you need to satisfy the various arguments, or you could opt 
> to directly read in tables/views from your Chado DB and do 
> the same.  Which of these is more appealing will probably 
> depend on your comfort levels with R, SQL and the data 
> source.  But either way I think the answer you need is to look at:
> 
> ?makeTranscriptDb
> 
> If you are feeling enterprising and would like to contribute a
> TranscriptDbFromGFF3() or a TranscriptDbFromChado() helper 
> function we would happily welcome your contribution.  ;)
> 
> 
> 
>   Marc
> 
> 
> 
> On 02/15/2011 09:08 PM, Cook, Malcolm wrote:
> > Hi,
> >
> > GenomicFeatures can create TranscriptDb from ucsc or from
> > BioMart, which I have used to good effect.  Thanks!
> >
> > My aim is now to perform an analysis with a specific old version of
> > the flybase supplied fly gene annotations (namely,
> > ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r5.22_FB2009_09/gff/)
> >
> > Thus, I would like to have a TranscriptDb populated from
> > this GFF3 (it is also available in Chado, if that helps).
> >
> > Has anyone already written, and would be willing to share, an
> > adaptor for GFF3 (makeTranscriptDbFromGFF3 ??) or Chado, or can
> > anyone suggest another route for me (write my own?)
> >
> >
> > Thanks,
> >
> > Malcolm Cook - Stowers Institute for Medical Research
> >
> >
> >   * http://fb2009_09.flybase.org/
> >   * ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r5.22_FB2009_09/gff/
> >   * http://fb2009_09.flybase.org/static_pages/downloads/archivedata3.html
> >
> > _______________________________________________
> > Bioconductor mailing list
> > Bioconductor at r-project.org
> > https://stat.ethz.ch/mailman/listinfo/bioconductor
> > Search the archives: 
> > http://news.gmane.org/gmane.science.biology.informatics.conductor
> >
> 

