[BioC] biomaRt getSequence through genomic position

Paul Hammer Paul.Hammer at p-t-p.de
Fri Nov 28 20:23:27 CET 2008


Hi all,

I am trying to get sequences via the getSequence function from biomaRt.
Specifically, I would like to have the last 5 bases of an exon and the
last 5 bases of the following intron. My approach is as follows:

library(biomaRt)
ensembl_rat = useMart("ensembl", dataset="rnorvegicus_gene_ensembl")
filter_rat = listFilters(ensembl_rat)
rat_exonsLocs = getBM(
    attributes = c("ensembl_exon_id", "exon_chrom_start", "exon_chrom_end"),
    filter = filter_rat[c(14,45,12),1],
    values = list(chromosome="1", status="KNOWN", biotype="protein_coding"),
    mart = ensembl_rat)
laenge = dim(rat_exonsLocs)[1]

ensembl_rat2 = useMart("ensembl", dataset="rnorvegicus_gene_ensembl",
                       mysql=TRUE)
for(i in 1:laenge){
    gseqs_exon = getSequence(chromosome = 1, start = rat_exonsLocs[i,3]-5,
                             end = rat_exonsLocs[i,3], mart = ensembl_rat2)
    seqs_introns = getSequence(chromosome = 1, start = rat_exonsLocs[i+1,2]-5,
                               end = rat_exonsLocs[i+1,2], mart = ensembl_rat2)
}
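
For illustration, the target coordinate windows can be written down without any database access. The following is only a minimal sketch: the exon table is made up (it is not Ensembl data), and it assumes 1-based inclusive coordinates, exons sorted by start position on the forward strand, consecutive rows belonging to the same transcript, and introns of at least 5 bases. Under those assumptions, a window from end-5 to end spans six bases, so the sketch uses end-4 to end for the exon and start-5 to start-1 of the next exon for the intron.

```r
# Hypothetical exon coordinates (1-based, inclusive), sorted by start.
# They stand in for the columns returned by getBM(); not real Ensembl data.
exons <- data.frame(
  ensembl_exon_id  = c("exonA", "exonB", "exonC"),
  exon_chrom_start = c(100, 300, 600),
  exon_chrom_end   = c(200, 450, 700)
)

n <- nrow(exons)
# The last exon has no following intron, so stop one row early.
idx <- seq_len(max(n - 1, 0))

windows <- data.frame(
  ensembl_exon_id  = exons$ensembl_exon_id[idx],
  # Last 5 bases of the exon: end-4 .. end
  exon_win_start   = exons$exon_chrom_end[idx] - 4,
  exon_win_end     = exons$exon_chrom_end[idx],
  # Last 5 bases of the following intron: the 5 bases just before
  # the start of the next exon
  intron_win_start = exons$exon_chrom_start[idx + 1] - 5,
  intron_win_end   = exons$exon_chrom_start[idx + 1] - 1
)

print(windows)
```

Computing all windows up front like this also avoids indexing row i+1 past the end of the table on the last iteration of the loop.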

But I always get this error message: "Error in mysqlNewConnection(drv,
...) : RS-DBI driver: (cannot allocate a new connection -- maximum
of 16 connections already opened)"

Is there a way to use useMart without mysql=TRUE to get sequences only
via genomic position? When I connect without mysql=TRUE
(useMart("ensembl", dataset="rnorvegicus_gene_ensembl")), I always have
to set seqType and type. When I do this, I don't get the 5 bases that I want!

Any help would be great!
Thanks in advance,
Paul


