[BioC] biomaRt getSequence through genomic position

steffen at stat.Berkeley.EDU
Wed Dec 3 05:18:57 CET 2008


Hi Paul,

To retrieve sequences with biomaRt and mysql=TRUE, the package actually
connects to two BioMarts: one is Ensembl and the other is the sequence
BioMart.  However, the user only needs to connect to the Ensembl BioMart;
under the hood, getSequence also connects to the sequence BioMart.  It
looks like getSequence doesn't close that second connection, so calling it
in a loop leaves more and more connections open until you hit the limit of
16 shown in your error message.  I'll try to provide a fix as soon as
possible.
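
Separately from the connection problem, the windows in your loop look one
base too wide to me: Ensembl coordinates are 1-based and inclusive, so
end-5 to end covers 6 bases, and start-5 to start of the next exon includes
the first base of that exon.  Below is a minimal sketch of how the
coordinates could be computed up front.  It does not contact any BioMart;
the exon table is made-up data with the same column names as your getBM
call, and the exon IDs are hypothetical.  It assumes the rows are sorted by
position and that consecutive rows are neighbouring exons of the same
transcript on the forward strand, which your real query does not guarantee.

```r
# Made-up exon table with the same columns as the getBM() call in the post.
# Assumption: rows are sorted by position and consecutive rows are
# neighbouring exons of one transcript on the forward strand.
exons <- data.frame(
  ensembl_exon_id  = c("exonA", "exonB", "exonC"),  # hypothetical IDs
  exon_chrom_start = c(100L, 300L, 700L),
  exon_chrom_end   = c(200L, 450L, 900L),
  stringsAsFactors = FALSE
)

k <- 5L                                   # window width in bases
n <- nrow(exons)
idx <- seq_len(max(n - 1L, 0L))           # the last exon has no following intron

windows <- data.frame(
  exon_id      = exons$ensembl_exon_id[idx],
  # last k bases of exon i (inclusive coordinates): end - (k - 1) .. end
  exon_start   = exons$exon_chrom_end[idx] - (k - 1L),
  exon_end     = exons$exon_chrom_end[idx],
  # last k bases of the intron after exon i: the k bases before exon i + 1
  intron_start = exons$exon_chrom_start[idx + 1L] - k,
  intron_end   = exons$exon_chrom_start[idx + 1L] - 1L,
  stringsAsFactors = FALSE
)

print(windows)
```

Each row of windows then holds the start and end you would pass to
getSequence for one exon/intron pair.  For genes on the reverse strand the
"last" bases sit at the low-coordinate end, so the arithmetic would need to
be mirrored.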

Unfortunately, it is not possible to retrieve genomic sequences with
mysql=FALSE.  We need to discuss this with the Ensembl developers and ask
them whether they could make it available through their BioMart web service.

Cheers,
Steffen

> Dear Paul,
>
> and what is the output of sessionInfo()?
>
>   bw Wolfgang
>
> Paul Hammer wrote:
>> Hi all,
>>
>> I am trying to get sequences via the getSequence function from biomaRt.
>> Specifically, I would like to have the last 5 bases of an exon and the
>> last 5 bases of the following intron. My approach is as follows:
>>
>> library(biomaRt)
>> ensembl_rat = useMart("ensembl", dataset="rnorvegicus_gene_ensembl")
>> filter_rat = listFilters(ensembl_rat)
>> rat_exonsLocs = getBM(attributes=c("ensembl_exon_id",
>> "exon_chrom_start", "exon_chrom_end"), filter=filter_rat[c(14,45,12),1],
>> values=list(chromosome="1", status="KNOWN", biotype="protein_coding"),
>> mart=ensembl_rat)
>> laenge = dim(rat_exonsLocs)[1]
>>
>> ensembl_rat2 = useMart("ensembl", dataset="rnorvegicus_gene_ensembl",
>> mysql=TRUE)
>> for(i in seq_len(laenge - 1)){  # stop one row early: the body reads row i+1
>> gseqs_exon = getSequence(chromosome = 1, start=rat_exonsLocs[i,3]-5,
>> end=rat_exonsLocs[i,3], mart = ensembl_rat2)
>> seqs_introns = getSequence(chromosome = 1, start=rat_exonsLocs[i+1,2]-5,
>> end=rat_exonsLocs[i+1,2], mart = ensembl_rat2)
>> }
>>
>> But I always get this error message: "Error in mysqlNewConnection(drv,
>> ...) : RS-DBI driver: (cannot allocate a new connection -- maximum
>> of 16 connections already opened)"
>>
>> Is there a way to use useMart without mysql=TRUE to get sequences only
>> via genomic position? When I connect without mysql=TRUE
>> (useMart("ensembl", dataset="rnorvegicus_gene_ensembl")), I always have
>> to set seqType and type. When I do this, I don't get the 5 bases that I
>> want!
>>
>> Any help would be great!
>> Thanks in advance,
>> Paul
>>
>


