[BioC] slow insertions into graphNEL object (24 hours for 16k nodes)

Seth Falcon sfalcon at fhcrc.org
Wed Sep 12 17:09:30 CEST 2007


Hi again,

Seth Falcon <sfalcon at FHCRC.ORG> writes:
> Paul Shannon <pshannon at systemsbiology.org> writes:
>> It took nearly 24 hours (!) to create a 16k node graph using two  
>> different techniques:
>>
>>     g = fromGXL (file ('someFile.gxl'))

Using a patched version of fromGXL (on my laptop) I get:

    > library(graph)
    > con = file("kegg.yeast.gxl", open="r")
    > system.time(z <- fromGXL(con))
       user  system elapsed 
    104.366   0.570 105.070 
    > z
    A graphNEL graph with undirected edges
    Number of Nodes = 15158 
    Number of Edges = 32668 
    > validObject(z)
    [1] TRUE

That's over 800x faster :-)  (Roughly 24 hours, i.e. 86,400 seconds,
down to about 105 seconds elapsed.)

I've checked the changes into devel as graph 1.15.17.  You can get it
from svn now or wait until this time tomorrow; either way, you will
need R 2.6.0 alpha.

The code passes the existing unit tests, but extra scrutiny to confirm
that the returned graphs are as expected would be quite welcome.
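
For anyone who wants to do that checking, here is a minimal sketch of
the kind of sanity checks that might help.  check_graph is a
hypothetical helper, not part of the graph package, and the three-node
graph is only a stand-in for the object returned by fromGXL.  The
expected counts in the final comment are the ones printed above.

```r
library(graph)

# Hypothetical helper (not part of the graph package): stops with an
# error if the graph does not match the expected counts and edge mode.
check_graph <- function(g, n_nodes, n_edges, mode = "undirected") {
  validObject(g)                        # errors if the S4 object is invalid
  stopifnot(numNodes(g) == n_nodes)
  stopifnot(numEdges(g) == n_edges)
  stopifnot(edgemode(g) == mode)
  # every edge endpoint must be a declared node
  endpoints <- unlist(edges(g), use.names = FALSE)
  stopifnot(all(endpoints %in% nodes(g)))
  invisible(TRUE)
}

# Small stand-in graph: a - b - c, undirected, so each edge is listed
# from both endpoints.
g <- new("graphNEL",
         nodes = c("a", "b", "c"),
         edgeL = list(a = "b", b = c("a", "c"), c = "b"),
         edgemode = "undirected")
check_graph(g, n_nodes = 3, n_edges = 2)

# For the graph read from kegg.yeast.gxl above, the call would be:
# check_graph(z, n_nodes = 15158, n_edges = 32668)
```

These checks only cover counts and basic consistency; comparing node
and edge attributes against the GXL file would still have to be done
by hand.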

+ seth

-- 
Seth Falcon | Computational Biology | Fred Hutchinson Cancer Research Center
BioC: http://bioconductor.org/
Blog: http://userprimary.net/user/
