[BioC] biomaRt queries: optimal size?

J.delasHeras at ed.ac.uk J.delasHeras at ed.ac.uk
Mon Dec 21 19:03:31 CET 2009


Quoting Wolfgang Huber <whuber at embl.de>:

>
> Dear Javier
>
> Try these:
>
> 1. Set
> 	options(error=recover)
> and then use the 'post mortem' debugger to see why postRes (a character
> string) is so large. Let us know what you find!
>
> 2. Rather than splitting up the query genes, you could split up the
> attributes, and only ask for a few at a time, and/or see which one
> causes the large size of the result
>
> 3. Send us a reproducible example (i.e. one that others can reproduce
> by copy-pasting from your email).
>
> 	Best wishes
> 	Wolfgang


"My name is not Javier!!!"

(you had to be in Spain in the 80s to get the joke... never mind, it
was a silly pop song ;-)

Thank you for the suggestions. I managed to finish what I was doing
(breaking the query into chunks of 200 IDs at a time), but I have some
more searches coming and will definitely use a different approach. I will
also try the options(error=recover) method to investigate if I have problems.
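
A minimal sketch of the chunking described above, in case it is useful
to others. The IDs are placeholders and run_query() is a hypothetical
stand-in: with biomaRt it would be a getBM() call with values = chunk
and whatever attributes, filters and mart are actually in use, none of
which are given in this thread.

```r
# Placeholder IDs standing in for the 1545 real identifiers (not from the thread).
ids <- sprintf("ID%04d", 1:1545)

# Assign each ID to a chunk of at most 200 and split accordingly.
chunk_size <- 200
chunks <- split(ids, ceiling(seq_along(ids) / chunk_size))

# Hypothetical stand-in for the real query; with biomaRt this would be a
# getBM() call with values = chunk and the attributes, filters and mart in use.
run_query <- function(chunk) {
  data.frame(id = chunk, stringsAsFactors = FALSE)
}

# Run one query per chunk and stack the results into a single data frame.
result <- do.call(rbind, lapply(chunks, run_query))

length(chunks)   # number of separate queries
nrow(result)     # rows returned by the stand-in, one per ID
```

With 1545 IDs and a chunk size of 200 this gives 8 chunks, which matches
the 8 individual queries mentioned below.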

My query, as you suggest above, would be better performed by using
fewer attributes, rather than splitting the IDs. I just didn't have
enough experience with this. When using multiple attributes, the
resulting data frame may contain quite a few more rows of data if
there are multiple values for some of the attributes... and this
happens a lot when looking at gene ontologies.
I may have started with a vector of 1545 IDs, but I ended up with a data
frame containing nearly 4 million rows! (assembled from 8 individual
queries of ~200 IDs at a time) I will definitely not do it again this
way!
Much better to pick fewer attributes and then process the data, and
then I'll probably be able to process all IDs at once.
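
A toy illustration of why the row count grows, assuming the combined
result behaves like a join across multi-valued attributes (I have not
checked exactly how the server assembles it). The gene, GO labels and
transcript names are made up for the example.

```r
# One gene with three (made-up) GO annotations...
go <- data.frame(id = rep("gene1", 3),
                 go_id = c("GO_A", "GO_B", "GO_C"),
                 stringsAsFactors = FALSE)

# ...and four (made-up) values of a second multi-valued attribute.
tx <- data.frame(id = rep("gene1", 4),
                 transcript = c("T1", "T2", "T3", "T4"),
                 stringsAsFactors = FALSE)

# Asking for both attributes together pairs every GO term with every
# transcript for the same ID, so the row counts multiply.
combined <- merge(go, tx, by = "id")

nrow(combined)        # rows when both attributes are requested together
nrow(go) + nrow(tx)   # rows when each attribute is requested separately
```

Here the combined table has 3 x 4 = 12 rows for a single ID, against
3 + 4 = 7 rows when the two attributes are fetched separately and
processed afterwards.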

Thank you for your help, Wolfgang and Jim.

Jose

-- 
Dr. Jose I. de las Heras                      Email: J.delasHeras at ed.ac.uk
The Wellcome Trust Centre for Cell Biology    Phone: +44 (0)131 6513374
Institute for Cell & Molecular Biology        Fax:   +44 (0)131 6507360
Swann Building, Mayfield Road
University of Edinburgh
Edinburgh EH9 3JR
UK
*********************************************
NEW EMAIL from July'09: nach.mcnach at gmail.com
*********************************************


-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.


