[BioC] Retrieving mRNA subsequence (GenomicFeatures, BSgenome)

Lukasz [guest] guest at bioconductor.org
Tue Oct 15 18:36:42 CEST 2013


Problem summary: How to retrieve the part of an mRNA sequence around a given location.

I have the locations of mRNA binding events as a GRanges object (GRevents) and need to retrieve the surrounding sequence for motif finding. The problem is that getSeq(flank(GRevents, width=n)) returns the genomic sequence, not the transcript sequence, i.e. it does not account for introns or the mRNA boundaries. I have tried to solve this with the exonsBy("transcriptDb object", "tx") function, but without success.
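
To make the difference concrete, here is a minimal base-R sketch on toy data. It is not a Bioconductor API: the chromosome string, the exon table, and the helpers genome_to_tx() and tx_window() are all hypothetical names made up for illustration, and only a single plus-strand transcript is handled. It only shows what "the sequence around a site in transcript coordinates" should mean.

```r
# Toy "chromosome" (hypothetical data), positions are 1-based.
chrom <- "AAAACCCCGGGGTTTTAAAACCCCGGGGTTTT"

# Exons of one hypothetical plus-strand transcript (inclusive coordinates).
exons <- data.frame(start = c(3, 13, 25), end = c(6, 16, 30))

# Spliced transcript: exon sequences concatenated in 5'->3' order.
mrna <- paste(substring(chrom, exons$start, exons$end), collapse = "")

# Map a genomic position to its position in the spliced transcript.
# Returns NA if the position is intronic or outside the transcript.
genome_to_tx <- function(pos, exons) {
  widths  <- exons$end - exons$start + 1
  offsets <- cumsum(c(0, head(widths, -1)))  # transcript bases before each exon
  i <- which(pos >= exons$start & pos <= exons$end)
  if (length(i) == 0) return(NA_integer_)
  as.integer(offsets[i] + pos - exons$start[i] + 1)
}

# Extract +/- n bases around a genomic position, in transcript space,
# clipped at the transcript ends instead of running into genomic sequence.
tx_window <- function(pos, n, exons, mrna) {
  p <- genome_to_tx(pos, exons)
  if (is.na(p)) return(NA_character_)
  substr(mrna, max(1, p - n), min(nchar(mrna), p + n))
}

site <- 14                                  # hypothetical binding site (genomic)
tx_seq      <- tx_window(site, 3, exons, mrna)
genomic_seq <- substr(chrom, site - 3, site + 3)
```

For this toy site the transcript-space window (tx_seq) spans two exon junctions and so differs from the plain genomic window (genomic_seq), which reads into the flanking introns. A site near the 5' end, such as position 4, yields a shorter window because it is clipped at the transcript start.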

Question: Is there a Bioconductor-supported way of solving this problem? With CLIP-seq becoming more and more popular, this will be a function in high demand.


 -- output of sessionInfo(): 

> sessionInfo()
R version 2.15.0 (2012-03-30)
Platform: x86_64-unknown-linux-gnu (64-bit)

 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [7] LC_PAPER=C                 LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] rtracklayer_1.16.0                 GenomicFeatures_1.8.1             
 [3] AnnotationDbi_1.18.4               Biobase_2.16.0                    
 [5] BSgenome.Mmusculus.UCSC.mm9_1.3.17 BSgenome_1.24.0                   
 [7] Biostrings_2.24.1                  GenomicRanges_1.8.3               
 [9] IRanges_1.14.2                     BiocGenerics_0.2.0                

loaded via a namespace (and not attached):
 [1] biomaRt_2.12.0  bitops_1.0-4.1  DBI_0.2-5       RCurl_1.91-1   
 [5] Rsamtools_1.8.0 RSQLite_0.11.1  stats4_2.15.0   tools_2.15.0   
 [9] XML_3.9-4       zlibbioc_1.2.0 

Sent via the guest posting facility at bioconductor.org.
