[BioC] biomaRt queries: optimal size?
J.delasHeras at ed.ac.uk
Tue Dec 22 14:28:00 CET 2009
Quoting Wolfgang Huber <whuber at embl.de>:
> Hola José
>
> sorry for the name confusion. The way that BioMart presents many-to-one
> relationships (producing one single big table with all queried
> attributes, and possibly lots of repetitions in some columns) can be
> very space-inefficient. This is the price that that system's design
> pays for the simplicity.
>
> Anyway, I don't think it should return table rows that are completely
> identical - if you (or someone else here) comes across such an
> instance, then please report that on this list!
>
> Best wishes
> Wolfgang
Hi Wolfgang,
no worries (about the name).
Yes, the results table is not the most space-efficient, but it IS
simple. It's just a matter of knowing the shape the results will take
(and now I know) and one can easily write the code accordingly. I
didn't come across entirely repeated rows, there was always at least
one difference. I think it works just the way it's supposed to.
I like to process this type of data by merging unique multiple hits
(GO ids, for instance) into one cell, maybe separated by a pipe
character "|". The resulting table is a lot smaller and can still be
easily searched.
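
As a rough illustration of the merging step described above, here is a
minimal sketch in plain Python (not biomaRt or R code). It assumes the
query result has already been read in as a list of row dictionaries.
The column names "ensembl_gene_id" and "go_id", the gene and GO values,
and the helper collapse_hits are all made-up placeholders for the
example, not anything taken from a real query.

```python
def collapse_hits(rows, key, sep="|"):
    """Merge rows that share the same `key` value into a single row.

    For every other column, the unique non-empty values are kept in
    first-seen order and joined with `sep`.
    """
    merged = {}  # key value -> {column -> list of unique values}
    for row in rows:
        slot = merged.setdefault(row[key], {})
        for col, value in row.items():
            if col == key:
                continue
            seen = slot.setdefault(col, [])
            if value not in ("", None) and value not in seen:
                seen.append(value)
    return [
        {key: k, **{col: sep.join(vals) for col, vals in cols.items()}}
        for k, cols in merged.items()
    ]


# Hypothetical result table: one row per (gene, GO id) pair,
# with the gene id repeated on every row.
rows = [
    {"ensembl_gene_id": "geneA", "go_id": "GO:0000001"},
    {"ensembl_gene_id": "geneA", "go_id": "GO:0000002"},
    {"ensembl_gene_id": "geneA", "go_id": "GO:0000001"},
    {"ensembl_gene_id": "geneB", "go_id": "GO:0000003"},
]

collapsed = collapse_hits(rows, "ensembl_gene_id")

# The collapsed cell can still be searched by splitting on the separator.
genes_with_term = [
    r["ensembl_gene_id"]
    for r in collapsed
    if "GO:0000002" in r["go_id"].split("|")
]
```

Splitting on the separator before searching, rather than matching a
substring, avoids false hits when one id happens to be a prefix of
another.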
> PS Do you know the way to San ... :)
(sitting at the piano)
no, but if you hum it... ;-)
Jose
--
Dr. Jose I. de las Heras Email: J.delasHeras at ed.ac.uk
The Wellcome Trust Centre for Cell Biology Phone: +44 (0)131 6513374
Institute for Cell & Molecular Biology Fax: +44 (0)131 6507360
Swann Building, Mayfield Road
University of Edinburgh
Edinburgh EH9 3JR
UK
*********************************************
NEW EMAIL from July'09: nach.mcnach at gmail.com
*********************************************
--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.