[BioC] Get all 3'utr and 5'utr region from GenomicFeatures

Steve Lianoglou mailinglist.honeypot at gmail.com
Tue May 24 17:19:45 CEST 2011


Hi,

On Tue, May 24, 2011 at 9:18 AM, James W. MacDonald
<jmacdon at med.umich.edu> wrote:
> Hi Fabrice,
>
> On 5/24/2011 4:56 AM, Fabrice Tourre wrote:
>>
>> Dear list,
>>
>> How can I get all 3'utr and 5'utr region from GenomicFeatures of Human?
>> There are fiveUTRsByTranscript, threeUTRsByTranscript methods in
>> GenomicFeatures. But how can I get these regions?
>
> library(GenomicFeatures)
> hg19 <- makeTranscriptDbFromUCSC(genome = "hg19", tablename = "refGene")
> utr3 <- threeUTRsByTranscript(hg19)
>
> Which seems pretty obvious to me, given the help pages for these functions.
>
> It would be helpful if you could give us an indication of where you got
> stuck, and what in particular you didn't understand from the help pages, so
> we can improve our documentation.

Perhaps the OP wanted to know how to get the sequences from those
regions, given the results from those functions?

If that's the case, I'd load the BSgenome.Hsapiens.UCSC.hg19
package, iterate over my results one chromosome at a time, and use
the Views function on the unmasked chromosome (passing the ranges of
the 3' UTRs on that chromosome as the second argument) ... you'll
have to reverse-complement the seqs on the reverse strand.

You could also look at the getSeq function.

I'll leave the actual writing of the code as an exercise to the
reader since I think it's useful to know how Views works and you'll
have to read up on the documentation a bit :-)
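
That said, a tiny toy example of the Views idea may be a useful
starting point. This is only a sketch under stated assumptions: the
"chromosome", the ranges, and the strands below are made-up
illustration data (the names chrom, utr_ranges and utr_strand are
hypothetical), standing in for an unmasked chromosome from
BSgenome.Hsapiens.UCSC.hg19 and the ranges returned by
threeUTRsByTranscript. It also treats each UTR as a single range; a
UTR that spans several exons would need its pieces joined in
transcript order.

```r
library(IRanges)     # IRanges() range constructor (Bioconductor)
library(Biostrings)  # DNAString, Views, reverseComplement (Bioconductor)

# Toy "chromosome" (made-up data); with real data this would be one
# unmasked chromosome taken from the BSgenome package.
chrom <- DNAString("AAACCCGGGTTTACGTACGT")

# Made-up UTR ranges on this chromosome, plus the strand of each one.
utr_ranges <- IRanges(start = c(1, 10), end = c(6, 15))
utr_strand <- c("+", "-")

# Views takes the subject first and the ranges second.
v <- Views(chrom, utr_ranges)
seqs <- as(v, "DNAStringSet")

# The views are read off the plus strand, so anything annotated on
# the minus strand has to be reverse-complemented.
minus <- utr_strand == "-"
seqs[minus] <- reverseComplement(seqs[minus])

as.character(seqs)
# "AAACCC" "CGTAAA"
```

With real data you would repeat this once per chromosome, using only
the ranges that fall on that chromosome.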

-steve

-- 
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact


