[BioC] Why is *ply-ing over a GRangesList much slower than *ply-ing over an IRangesList?

Vincent Carey stvjc at channing.harvard.edu
Fri Oct 15 06:29:24 CEST 2010


In my experience you can load the invalid object; just don't try to
validate or evaluate it before updateObject is run.
If you can't load it, it would be interesting to know why, so please
provide more details if you run into this.
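
A rough base-R sketch of the same idea, in case it helps: an object saved
under an old class definition still loads, and only validation fails until
the object is rebuilt. The "Gene" class and update_gene() below are
hypothetical stand-ins invented for illustration; they are not GRanges and
not how Bioconductor's updateObject() is implemented.

```r
# Base-R sketch only: "Gene" and update_gene() are hypothetical stand-ins,
# not GRanges and not Bioconductor's updateObject().
library(methods)

# Old class definition, and a list of instances saved to disk.
setClass("Gene", representation(start = "integer"))
genes <- list(new("Gene", start = 1L), new("Gene", start = 10L))
rda <- tempfile(fileext = ".rda")
save(genes, file = rda)
rm(genes)

# The class later gains a slot, so the saved instances are out of date.
setClass("Gene", representation(start = "integer", circular = "logical"))

# load() restores the objects without validating them.
load(rda)

# Validation is the step that fails on the stale objects.
stale <- try(validObject(genes[[1]]), silent = TRUE)

# Rebuild each object under the current definition, then validate again.
update_gene <- function(x) new("Gene", start = x@start, circular = FALSE)
genes <- lapply(genes, update_gene)
ok <- vapply(
  genes,
  function(g) !inherits(try(validObject(g), silent = TRUE), "try-error"),
  logical(1)
)
```

With real GRanges objects the analogous step would presumably be the
`genes <- lapply(genes, updateObject)` suggested below, run right after
load() and before anything validates or prints the objects.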

On Fri, Oct 15, 2010 at 12:10 AM, Steve Lianoglou
<mailinglist.honeypot at gmail.com> wrote:
> On Thu, Oct 14, 2010 at 11:07 PM, Martin Morgan <mtmorgan at fhcrc.org> wrote:
>> On 10/14/2010 04:04 PM, Steve Lianoglou wrote:
>>> On Thu, Oct 14, 2010 at 5:55 PM, Martin Morgan <mtmorgan at fhcrc.org> wrote:
>>> <snip>
>>>> As an update, Patrick has improved performance 10x-ish in IRanges
>>>> 1.7.40, still some more to go...
>>>>
>>>>> replicate(5, system.time(lapply(xcripts, length)))
>>>>           [,1]  [,2]  [,3]  [,4]  [,5]
>>>> user.self  0.31 0.317 0.318 0.313 0.328
>>>> sys.self   0.00 0.002 0.000 0.002 0.000
>>>> elapsed    0.31 0.325 0.319 0.317 0.329
>>>> user.child 0.00 0.000 0.000 0.000 0.000
>>>> sys.child  0.00 0.000 0.000 0.000 0.000
>>>>
>>>>> irl <- IRangesList(lapply(xcripts, ranges))
>>>>
>>>>> replicate(5, system.time(lapply(irl, length)))
>>>>            [,1]  [,2]  [,3]  [,4]  [,5]
>>>> user.self  0.032 0.031 0.032 0.031 0.030
>>>> sys.self   0.000 0.000 0.000 0.001 0.001
>>>> elapsed    0.032 0.031 0.032 0.032 0.031
>>>> user.child 0.000 0.000 0.000 0.000 0.000
>>>> sys.child  0.000 0.000 0.000 0.000 0.000
>>>
>>> Awesome!
>>>
>>> Thanks for dumping some brain power into this.
>>>
>>> Out of curiosity: I have several lists of serialized GRanges objects
>>> which I had to regenerate with the introduction of isCircular (or
>>> whatever it was) because of binary incompatibility with old/new
>>> versions of GRanges.
>>>
>>> Do these updates break any binary compatibility or anything? I'm not
>>> complaining, I just want to make sure I avoid updating until I can get
>>> "out of the woods" and find time to regenerate these things ;-).
>>
>> No, the speed-up did not involve changes in class structure.
>
> Nice.
>
>> Have you tried updateObject on your objects?
>
> No (I didn't even know it was there *blush*).
>
> It's not exactly clear to me how I would have done that, though. If I
> remember correctly R was failing inside the load() call, so I didn't
> have a chance to updateObject() anything ... does that make sense?
>
> Imagine I had a file called "genes.rda" which consisted of one object:
> a list of GRanges objects called `genes`.
>
> I thought I was getting an error right after load("genes.rda"). Can I
> suppress validity checks for a minute while I load "genes.rda", then
> `genes <- lapply(genes, updateObject)`, or something?
>
> -steve
>
> --
> Steve Lianoglou
> Graduate Student: Computational Systems Biology
>  | Memorial Sloan-Kettering Cancer Center
>  | Weill Medical College of Cornell University
> Contact Info: http://cbio.mskcc.org/~lianos/contact
>


