[BioC] Why is *ply-ing over a GRangesList much slower than *ply-ing over an IRangesList?

Steve Lianoglou mailinglist.honeypot at gmail.com
Wed Aug 25 20:59:52 CEST 2010


Hi,

On Wed, Aug 25, 2010 at 1:58 PM, Patrick Aboyoun <paboyoun at fhcrc.org> wrote:
> Steve,
> I haven't profiled the code yet to know what is going on, but I will address
> your followup question.
>
> I have a feeling that the GRangesList concept will be growing over time and
> I am not sure what the tipping point will be for changes in code to occur. I
> see two issues related to GRangesList. The first being its internal storage
> (as you mentioned)

Yeah. That's a +1 vote on addressing that at some point from me :-)

> and the second being its semantics (are the
> ranges/intervals contained within each of the elements "grouped" as exons
> within a transcript or are the ranges/intervals considered to be independent
> entities as collections of tracks for a genome browser).

I'm not sure ... my first reaction is that one would consider
each element in a GRangesList to be grouped "in some way" (like exons,
as you mention). I would model separate tracks as separate
GRangesList objects, not as separate *Ranges elements within a single
*RangesList.

I actually can't think of a scenario where I would want the fire-power
of *RangesList objects (primarily fast overlap and set-like queries)
to address your 2nd scenario (different tracks), whereas I can easily
appreciate (more and more) the value of treating each element in the
*RangesList as grouped ... and if I don't want the elements to
be grouped at all, I'd just unlist() it ...

If elements weren't "grouped" I think I'd probably only ever want to
iterate over each *RangesList element and do set operations w/in those
(maybe overlap TF binding (one IRanges) with some acetylation track
(another IRanges)) -- but again, I haven't thought about it in this way
before ...
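
A rough sketch of that "independent tracks" usage, assuming the IRanges
package with its current accessors (queryHits() may not match the API
available when this was written). The coordinates and the track names
"tf" and "acetyl" are made up purely for illustration:

```r
library(IRanges)

# Hypothetical coordinates, invented for this example only.
tf_binding  <- IRanges(start = c(100, 500, 900), end = c(150, 560, 950))
acetylation <- IRanges(start = c(120, 700, 940), end = c(300, 800, 1000))

# Treat the two tracks as independent (ungrouped) elements of one list.
tracks <- IRangesList(tf = tf_binding, acetyl = acetylation)

# Set-like operation between two elements: regions covered by both tracks.
shared <- intersect(tracks[["tf"]], tracks[["acetyl"]])

# Overlap query: which TF sites touch an acetylated region?
hits <- findOverlaps(tracks[["tf"]], tracks[["acetyl"]])

# Dropping the list structure entirely gives one flat IRanges object.
flat <- unlist(tracks)
```

Here the list is only a container: each element is pulled out and
compared with another one, which is different from the "grouped" case
where the whole element (e.g. all exons of a transcript) acts as one unit.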

I'm not even sure this was 2 cents worth, but there you have it ...

-- 
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
