[BioC] biomaRt:getBM error when query is large

steffen at stat.Berkeley.EDU steffen at stat.Berkeley.EDU
Sat Aug 2 00:09:40 CEST 2008


Hi Tao,

I haven't hit a limit yet but you might have.  430,000 ids is quite a lot.

Try splitting your query into a few batches of, e.g., 100,000 or 50,000 ids
(you should not need to go below this size).

I would also put

Sys.sleep(1)

between queries so you don't get into trouble by querying the server too
fast after an earlier query.

I bet:

# Batches start at 100001, 200001, ... so that no id is queried twice
tmp1 <- getBM(c("ensembl_gene_stable_id", "refsnp_id",
"allele","chr_name", "chrom_start", "chrom_strand"),filters = "refsnp",
values = rs[1:100000], mart = mart)
Sys.sleep(1)
tmp2 <- getBM(c("ensembl_gene_stable_id", "refsnp_id",
"allele","chr_name", "chrom_start", "chrom_strand"),filters = "refsnp",
values = rs[100001:200000], mart = mart)
Sys.sleep(1)
tmp3 <- getBM(c("ensembl_gene_stable_id", "refsnp_id",
"allele","chr_name", "chrom_start", "chrom_strand"),filters = "refsnp",
values = rs[200001:300000], mart = mart)
Sys.sleep(1)
# Last batch runs to the end of rs, whatever its exact length is
tmp4 <- getBM(c("ensembl_gene_stable_id", "refsnp_id",
"allele","chr_name", "chrom_start", "chrom_strand"),filters = "refsnp",
values = rs[300001:length(rs)], mart = mart)

all = rbind(tmp1,tmp2,tmp3,tmp4)

Should do it.
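
If the number of ids changes, the same batching could be written as a loop
instead of by hand. This is only a rough sketch: query_in_batches is a
made-up helper, not a biomaRt function, and the batch size and pause are
guesses rather than documented server limits.

```r
# Hypothetical helper (not part of biomaRt): calls query_fun on successive
# chunks of `values`, pauses between calls, and row-binds the results.
query_in_batches <- function(values, query_fun, batch_size = 50000, pause = 1) {
  # Chunk number for every element: 1,1,...,2,2,...
  chunk_id <- ceiling(seq_along(values) / batch_size)
  chunks <- split(values, chunk_id)

  results <- vector("list", length(chunks))
  for (i in seq_along(chunks)) {
    if (i > 1) Sys.sleep(pause)  # wait between queries, not before the first
    results[[i]] <- query_fun(chunks[[i]])
  }
  do.call(rbind, results)
}

# Usage with the query from this thread (assumes `rs` and `mart` exist):
# snps <- query_in_batches(rs, function(ids)
#   getBM(c("ensembl_gene_stable_id", "refsnp_id", "allele",
#           "chr_name", "chrom_start", "chrom_strand"),
#         filters = "refsnp", values = ids, mart = mart))
```

Passing the query as a function keeps the helper independent of the
attributes and filter you ask for.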

Cheers,
Steffen


> Hi list,
>
> See the sample codes below, where "rs" is a char vector containing ~430000
> rs IDs.  However, when I ran the query 10000 at a time, it worked.  Is
> there a query limit for biomaRt?
>
> Thanks,
>
> ...Tao
>
>
>
>> tmp <- getBM(c("ensembl_gene_stable_id", "refsnp_id", "allele",
>> "chr_name", "chrom_start", "chrom_strand"),
> +                     filters = "refsnp", values = rs, mart = mart)
> Error in postForm(paste(martHost(mart), "?", sep = ""), query = xmlQuery)
> :
>   Empty reply from server
>
>> sessionInfo()
> R version 2.7.0 (2008-04-22)
> i386-pc-mingw32
>
> locale:
> LC_COLLATE=English_United States.1252;LC_CTYPE=English_United
> States.1252;LC_MONETARY=English_United
> States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252
>
> attached base packages:
> [1] tools     stats     graphics  grDevices utils     datasets  methods
> base
>
> other attached packages:
> [1] biomaRt_1.14.0      RCurl_0.9-3         GO.db_2.2.0
> AnnotationDbi_1.2.2 RSQLite_0.6-9       DBI_0.2-4           Biobase_2.0.1
>
> loaded via a namespace (and not attached):
> [1] XML_1.95-2
>