[BioC] Why is *ply-ing over a GRangesList much slower than *ply-ing over an IRangesList?

Steve Lianoglou mailinglist.honeypot at gmail.com
Fri Oct 15 07:09:52 CEST 2010


On Fri, Oct 15, 2010 at 12:29 AM, Vincent Carey
<stvjc at channing.harvard.edu> wrote:
> in my experience you can load the invalid object, just don't try to
> validate or evaluate it before updateObject is run.
> if you can't load it could be interesting to know why, so provide more
> details if you run into this.

No, you're right.

I dug up an old such list-of-GRanges object and was able to
essentially `updated <- lapply(old.list, updateObject)` it into shape.
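
For anyone finding this in the archives, here is a minimal sketch of that
approach, assuming the saved file holds a single plain list of GRanges
objects. The helper name `update_saved_list` and its arguments are made up
for illustration. `BiocGenerics::updateObject` is where the generic lives
in current Bioconductor releases, so adjust that if your installation
differs.

```r
# Sketch: load an .rda file into a scratch environment, then update every
# element of the list stored in it. `update_saved_list`, `path` and `name`
# are hypothetical names, not part of any package.
update_saved_list <- function(path, name,
                              update = BiocGenerics::updateObject) {
  env <- new.env()
  load(path, envir = env)        # restores the objects as saved; validObject() is not called
  old <- get(name, envir = env)  # the outdated list, not yet touched
  lapply(old, update)            # apply the updater element by element
}

# Usage with the file described earlier in the thread (assumed to exist):
# genes <- update_saved_list("genes.rda", "genes")
# invisible(lapply(genes, validObject))  # errors if any element is still invalid
```

Loading into a scratch environment keeps the outdated objects out of the
workspace, so nothing tries to print or validate them before the update
has run.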

Sorry for the confusion.

-steve

>
> On Fri, Oct 15, 2010 at 12:10 AM, Steve Lianoglou
> <mailinglist.honeypot at gmail.com> wrote:
>> On Thu, Oct 14, 2010 at 11:07 PM, Martin Morgan <mtmorgan at fhcrc.org> wrote:
>>> On 10/14/2010 04:04 PM, Steve Lianoglou wrote:
>>>> On Thu, Oct 14, 2010 at 5:55 PM, Martin Morgan <mtmorgan at fhcrc.org> wrote:
>>>> <snip>
>>>>> As an update, Patrick has improved performance 10x-ish in IRanges
>>>>> 1.7.40, still some more to go...
>>>>>
>>>>>> replicate(5, system.time(lapply(xcripts, length)))
>>>>>           [,1]  [,2]  [,3]  [,4]  [,5]
>>>>> user.self  0.31 0.317 0.318 0.313 0.328
>>>>> sys.self   0.00 0.002 0.000 0.002 0.000
>>>>> elapsed    0.31 0.325 0.319 0.317 0.329
>>>>> user.child 0.00 0.000 0.000 0.000 0.000
>>>>> sys.child  0.00 0.000 0.000 0.000 0.000
>>>>>
>>>>>> irl <- IRangesList(lapply(xcripts, ranges))
>>>>>
>>>>>> replicate(5, system.time(lapply(irl, length)))
>>>>>            [,1]  [,2]  [,3]  [,4]  [,5]
>>>>> user.self  0.032 0.031 0.032 0.031 0.030
>>>>> sys.self   0.000 0.000 0.000 0.001 0.001
>>>>> elapsed    0.032 0.031 0.032 0.032 0.031
>>>>> user.child 0.000 0.000 0.000 0.000 0.000
>>>>> sys.child  0.000 0.000 0.000 0.000 0.000
>>>>
>>>> Awesome!
>>>>
>>>> Thanks for dumping some brain power into this.
>>>>
>>>> Out of curiosity: I have several lists of serialized GRanges objects
>>>> which I had to regenerate with the introduction of isCircular (or
>>>> whatever it was) because of binary incompatibility with old/new
>>>> versions of GRanges.
>>>>
>>>> Do these updates break any binary compatibility or anything? I'm not
>>>> complaining, I just want to make sure I avoid updating until I can get
>>>> "out of the woods" and find time to regenerate these things ;-).
>>>
>>> No, the speed-up did not involve changes in class structure.
>>
>> Nice.
>>
>>> Have you tried updateObject on your objects?
>>
>> No (I didn't even know it was there *blush*).
>>
>> It's not exactly clear to me how I would have done that, though. If I
>> remember correctly R was failing inside the load() call, so I didn't
>> have a chance to updateObject() anything ... does that make sense?
>>
>> Imagine I had a file called "genes.rda" which consisted of one object:
>> a list of GRanges objects called `genes`.
>>
>> I thought I was getting an error right after load("genes.rda"). Can I
>> suppress validity checks for a minute while I load "genes.rda", then
>> `genes <- lapply(genes, updateObject)`, or something?
>>
>> -steve
>>
>> --
>> Steve Lianoglou
>> Graduate Student: Computational Systems Biology
>>  | Memorial Sloan-Kettering Cancer Center
>>  | Weill Medical College of Cornell University
>> Contact Info: http://cbio.mskcc.org/~lianos/contact
>>
>>
>



-- 
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact


