[BioC] intragenic sequence extraction

Steve Lianoglou mailinglist.honeypot at gmail.com
Tue Nov 29 14:27:41 CET 2011


On Tue, Nov 29, 2011 at 6:25 AM, Yating Cheng <yating.cheng at charite.de> wrote:
> Dear All,
> Does anyone know how to extract intragenic sequences from the genome?
> Like in the genescan, it is mentioned that the predictions are based on
> transcriptional, translational and donor/acceptor splicing signals as well
> as the length and compositional distributions of exons, introns and
> intergenic regions.
> But I am not sure which function I should use.

Use the GenomicFeatures package to build a gene annotation database
for your organism. Once you are a bit more familiar with that package
(a good place to start is its vignette, along with the one for
GenomicRanges), you will see how to identify where the intergenic
regions lie on your genome.
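
As a rough sketch of that first step: the organism was not stated, so
the human hg19 annotation package below is purely an assumption, and
the variable names are only illustrative. The idea is to take the gene
ranges, merge them without regard to strand, and treat whatever they
leave uncovered as intergenic.

```r
library(GenomicFeatures)                    # provides genes()
library(TxDb.Hsapiens.UCSC.hg19.knownGene)  # ASSUMPTION: human, hg19

txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene

# One range per gene, taken from the annotation database.
gene_ranges <- genes(txdb)

# Merge overlapping genes and ignore strand, so a position counts as
# genic if a gene covers it on either strand.
genic <- reduce(gene_ranges, ignore.strand = TRUE)

# gaps() returns the stretches of each chromosome not covered by
# 'genic'. It relies on the sequence lengths carried over from the
# TxDb. The result also contains whole-chromosome ranges for the "+"
# and "-" strands, so keep only the unstranded ("*") gaps.
intergenic <- gaps(genic)
intergenic <- intergenic[strand(intergenic) == "*"]
```

Whether you want to be strand-aware, or to drop the gaps at the very
ends of each chromosome, depends on what you plan to do with them.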

Once you have the intergenic regions of your genome stored in a
GRanges object, it's easy to use those ranges to get the sequence from
your genome using the appropriate BSgenome.* package for your
organism. The Biostrings::getSeq function can do that for you, among
other things.
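
Something along these lines should work for the extraction step.
Again, the hg19 genome package is just an assumption, and the
coordinates are placeholders made up for illustration (they are not
real intergenic regions); in practice you would pass in your own
GRanges.

```r
library(GenomicRanges)
library(BSgenome.Hsapiens.UCSC.hg19)  # ASSUMPTION: human, hg19

# Placeholder ranges standing in for your intergenic GRanges object.
# These coordinates are hypothetical, chosen only to show the call.
intergenic <- GRanges(
  seqnames = "chr1",
  ranges   = IRanges(start = c(1000001, 2000001),
                     end   = c(1000050, 2000050))
)

# getSeq() returns a DNAStringSet with one sequence per range.
seqs <- getSeq(BSgenome.Hsapiens.UCSC.hg19, intergenic)

# Label each sequence with its coordinates and write it out as FASTA.
names(seqs) <- paste0(as.character(seqnames(intergenic)), ":",
                      start(intergenic), "-", end(intergenic))
writeXStringSet(seqs, "intergenic.fa")
```

Ranges with strand "*" are read from the plus strand; ranges on the
"-" strand come back reverse-complemented.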

> Thank you very much.
> Yating Cheng
> Molecular Medicine Master's Program
> Charité-Universitätsmedizin Berlin

Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
