[BioC] biomaRt queries: optimal size?

J.delasHeras at ed.ac.uk
Tue Dec 22 14:28:00 CET 2009


Quoting Wolfgang Huber <whuber at embl.de>:

> Hola José
>
> sorry for the name confusion. The way that BioMart presents many-to-one
> relationships (producing one single big table with all queried
> attributes, and possibly lots of repetitions in some columns) can be
> very space-inefficient. This is the price that that system's design
> pays for the simplicity.
>
> Anyway, I don't think it should return table rows that are completely
> identical -  if you (or someone else here) comes across such an
> instance,  then please report that on this list!
>
> 	Best wishes
> 	Wolfgang

Hi Wolfgang,

no worries (about the name).
Yes, the results table is not the most space-efficient, but it IS
simple. It's just a matter of knowing the shape the results will take
(and now I know), and then one can easily write the code accordingly. I
didn't come across entirely repeated rows; there was always at least
one difference. I think it works just the way it's supposed to.
I like to process this type of data by merging the unique multiple hits
(GO ids, for instance) into one cell, separated by, say, a pipe
character "|". The resulting table is a lot smaller and can still be
easily searched.
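
For illustration, here is a minimal sketch of that merging step in base R.
It does not query BioMart: the data frame `res` is a made-up stand-in for a
getBM() result, and the column names and values (ensembl_gene_id, go_id,
"geneA", etc.) are assumptions chosen only for the example.

```r
# Toy stand-in for a biomaRt result: one row per gene/GO-term pair.
# Column names and values are illustrative assumptions, not real query output.
res <- data.frame(
  ensembl_gene_id = c("geneA", "geneA", "geneA", "geneB", "geneC", "geneC"),
  go_id           = c("GO:0000001", "GO:0000002", "GO:0000001",
                      "GO:0000003", "GO:0000004", "GO:0000005"),
  stringsAsFactors = FALSE
)

# Collapse the unique GO ids of each gene into a single "|"-separated cell.
collapsed <- aggregate(
  go_id ~ ensembl_gene_id,
  data = res,
  FUN  = function(x) paste(unique(x), collapse = "|")
)

print(collapsed)

# Searching stays simple: fixed = TRUE treats the pattern as a literal
# string, so no regular-expression characters need escaping.
hits <- collapsed[grepl("GO:0000002", collapsed$go_id, fixed = TRUE), ]
```

The result has one row per gene instead of one row per gene/GO pair. Note
that the formula interface of aggregate() drops rows with missing values by
default, so genes without any GO id would need separate handling.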

> PS Do you know the way to San ... :)

(sitting at the piano)
no, but if you hum it... ;-)

Jose

-- 
Dr. Jose I. de las Heras                      Email: J.delasHeras at ed.ac.uk
The Wellcome Trust Centre for Cell Biology    Phone: +44 (0)131 6513374
Institute for Cell & Molecular Biology        Fax:   +44 (0)131 6507360
Swann Building, Mayfield Road
University of Edinburgh
Edinburgh EH9 3JR
UK
*********************************************
NEW EMAIL from July'09: nach.mcnach at gmail.com
*********************************************

-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.


