[BioC] extracting sequence from a genome

Iain Gallagher iaingallagher at btopenworld.com
Thu Mar 15 16:23:10 CET 2012



Hello List

I have a data frame of miRNA genomic positions and I would like to get the sequence of the 200 bp upstream of each microRNA.

library(BSgenome.Rnorvegicus.UCSC.rn4) # get genome

ftpAddr <- "ftp://mirbase.org/pub/mirbase/CURRENT/genomes/rno.gff" # get miR coords
mirInfo <- read.table(ftpAddr) # as dataframe

seqs <- list() # holder

for (i in 1:nrow(mirInfo)) {
  seq <- getSeq(Rnorvegicus, paste('chr', mirInfo[i,1], sep=''), start = mirInfo[i,4], end = mirInfo[i,4]+200)
  seqs <- c(seqs, seq)
}

This works, but it seems pretty inefficient: my PC locks up while the loop runs.

Could someone point me to a better way?
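
One possible direction, as a minimal sketch rather than a tested answer: getSeq() on a BSgenome object accepts vectors for names, start, end and strand, so the whole table can go through in a single call instead of one call per row. The sketch uses a small made-up table (hypothetical coordinates) in place of the miRBase file, and assumes the standard GFF column layout (V1 = chromosome, V4 = start, V5 = end, V7 = strand). Note that the loop above takes start to start+200, which runs into the miRNA itself on the + strand. The sketch instead takes the 200 bp before the start on + and after the end on -, which is an assumption about what "upstream" should mean here.

```r
# Toy stand-in for the miRBase GFF table (hypothetical coordinates).
# Assumed GFF layout: V1 = chromosome, V4 = start, V5 = end, V7 = strand.
mirInfo <- data.frame(V1 = c("1", "1", "X"),
                      V4 = c(1000L, 5000L, 150L),
                      V5 = c(1080L, 5090L, 230L),
                      V7 = c("+", "-", "+"),
                      stringsAsFactors = FALSE)

flank <- 200L
plus  <- mirInfo$V7 == "+"

# Upstream window: before the start on "+", after the end on "-".
upStart <- ifelse(plus, mirInfo$V4 - flank, mirInfo$V5 + 1L)
upEnd   <- ifelse(plus, mirInfo$V4 - 1L,    mirInfo$V5 + flank)
upStart <- pmax(upStart, 1L)  # do not run off the start of the chromosome

chrNames <- paste0("chr", mirInfo$V1)

# Only fetch sequence if the genome package is installed.
if (requireNamespace("BSgenome.Rnorvegicus.UCSC.rn4", quietly = TRUE)) {
  library(BSgenome.Rnorvegicus.UCSC.rn4)
  # One vectorised call; strand = "-" gives the reverse complement.
  seqs <- getSeq(Rnorvegicus, chrNames, start = upStart, end = upEnd,
                 strand = mirInfo$V7)
}
```

With the real table from read.table(), V7 may come back as a factor, so as.character(mirInfo$V7) would be the safer thing to pass as strand. Windows on the - strand are not clamped to the chromosome length in this sketch.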

Thanks

iain


