[BioC] Fwd: extract introns

Steve Lianoglou mailinglist.honeypot at gmail.com
Thu Nov 10 16:29:24 CET 2011


Sorry, forgot to cc bioc-list:

Hi,

On Wed, Nov 9, 2011 at 6:22 AM, Yating Cheng <yating.cheng at charite.de> wrote:
> Dear Bioconductor Members,
>
> I now need to extract intron sequences; I already have the exon+intron
> and intron sequences. Someone told me that I can use Biostrings. I tried,
> but it failed.
>
> Do you know how to use Biostrings to solve this problem, or is there any
> other way to solve it?

Here's a heavy-handed hammer you can use:

(1) Store your intronic ranges in a GRanges object, let's call this
GRanges object `intron.ranges`.

(2) Make sure you have the appropriate BSgenome.* package which
has the genomic sequence for the organism you are working with. For
instance, if you are working with human data with the hg19 release,
you'd need the BSgenome.Hsapiens.UCSC.hg19 package.
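
For step (1), here is a minimal sketch of one way to build `intron.ranges`,
assuming you know the exon coordinates of a transcript. The coordinates,
the chromosome and the strand below are made up for illustration; the
introns are taken to be the gaps between consecutive exons. Note that the
sequence names have to match those used by the BSgenome package (e.g.
"chr1" for the UCSC builds).

```r
library(IRanges)
library(GenomicRanges)

## Hypothetical exon coordinates for a single transcript (made-up numbers)
exons <- IRanges(start = c(1000, 1500, 2000),
                 end   = c(1200, 1700, 2300))

## The introns are the stretches between consecutive exons
introns <- gaps(exons)

## Wrap them in a GRanges object; chromosome and strand are assumptions here
intron.ranges <- GRanges(seqnames = "chr1", ranges = introns, strand = "+")
names(intron.ranges) <- paste0("intron", seq_along(intron.ranges))

intron.ranges
```

If your exons come from several transcripts, you would do this per
transcript rather than on all exons pooled together.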

Now:

R> library(BSgenome.Hsapiens.UCSC.hg19)

## This will load a `Hsapiens` BSgenome object into your workspace

R> intron.seqs <- getSeq(Hsapiens, intron.ranges, as.character=FALSE)

This may take a while -- and also require a bit of RAM, so hopefully
you're not limited by either.

`intron.seqs` will be a DNAStringSet for the introns listed in
`intron.ranges` ... I guess you can take it from there ...
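
As one possible way of taking it from there, here is a small sketch that
inspects a DNAStringSet and writes it to a FASTA file with Biostrings. To
keep it self-contained it uses a toy DNAStringSet with made-up sequences
as a stand-in for the real result of getSeq(); the temporary file is just
for illustration.

```r
library(Biostrings)

## Toy stand-in for the `intron.seqs` returned by getSeq() (made-up sequences)
intron.seqs <- DNAStringSet(c(intron1 = "GTAAGTTTTCAG",
                              intron2 = "GTGAGCCCTAG"))

## Quick look at lengths and base composition
width(intron.seqs)
alphabetFrequency(intron.seqs, baseOnly = TRUE)

## Write the sequences to a FASTA file; the names become the record headers
fasta.file <- tempfile(fileext = ".fa")
writeXStringSet(intron.seqs, fasta.file)

## Read the file back in to check the round trip
read.back <- readDNAStringSet(fasta.file)
```

With the real data you would skip the toy object and pass the output of
getSeq() straight to writeXStringSet().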

HTH,
-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact





