[BioC] mm10 goseq supported genome

Steve Lianoglou mailinglist.honeypot at gmail.com
Wed Jan 30 12:27:23 CET 2013


Hi,

On Wed, Jan 30, 2013 at 4:10 AM, Vincenzo Capece <vivo0304 at gmail.com> wrote:
> Dear all,
> I am trying to analyze RNA seq data with goseq but I have a problem:
> mm10 (the reference genome that I am using for my experiment) is not
> supported by goseq. The last version supported by goseq is mm9.
> Can you help me to annotate mm10 RNA seq data with goseq?
> The version of my goseq is: 1.8.0.
> Thanks in advance.

You "just" need to annotate each read count with the length of the
gene (or transcript) it comes from, right?

This can easily be done with a GenomicFeatures database built from the
annotation source you used to tally the reads per gene (or transcript).
Several are already built for you here:

http://bioconductor.org/packages/release/BiocViews.html#___AnnotationData

Namely the following packages:

* TxDb.Mmusculus.UCSC.mm10.ensGene
* TxDb.Mmusculus.UCSC.mm10.knownGene

Or you can build your own if you didn't use the Ensembl or UCSC annotations.

If you're not already familiar with the functionality provided in the
GenomicFeatures package, I'd encourage you to read through its
documentation (vignettes) -- it's extremely helpful. Once you do so,
you'll find it very easy to calculate the width of your transcripts
(or portions of them) as you see fit.
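
As a rough sketch of what that could look like (not necessarily the
only or best way): the code below assumes your reads were tallied per
gene against the UCSC knownGene annotation, so that your gene
identifiers match the ones used in
TxDb.Mmusculus.UCSC.mm10.knownGene. The `de` vector is a made-up
placeholder standing in for your own differential expression calls,
and taking the summed width of the merged exons as the "gene length"
is one choice among several.

```r
library(GenomicFeatures)
library(TxDb.Mmusculus.UCSC.mm10.knownGene)
library(goseq)

txdb <- TxDb.Mmusculus.UCSC.mm10.knownGene

## Exons grouped by gene. The list names are the gene IDs stored in the
## annotation package (assumed here to match the IDs on your counts).
exons.by.gene <- exonsBy(txdb, by = "gene")

## Merge overlapping exons within each gene so that shared bases are
## counted once, then add up the exon widths to get one length per gene.
gene.length <- sum(width(reduce(exons.by.gene)))

## Placeholder only: 'de' stands in for your own named 0/1 vector
## (1 = differentially expressed). Replace it with your real results.
set.seed(1)
de <- rbinom(length(gene.length), 1, 0.05)
names(de) <- names(gene.length)

## Hand the lengths to goseq directly instead of letting it look them
## up for a genome it does not know about.
pwf <- nullp(de, bias.data = gene.length[names(de)], plot.fit = FALSE)
```

The point of passing `bias.data` is that goseq then never has to look
up lengths for mm10 itself. Whether the later steps of your goseq run
need anything else for mm10 (for example your own gene-to-category
mapping) is something I have not checked.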

HTH,
-steve

-- 
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact


