[BioC] R crashes with GEOmetadb

Hooiveld, Guido Guido.Hooiveld at wur.nl
Thu Jun 30 14:50:06 CEST 2011


Hi Sean,
Indeed, you are correct! 
Due to my inexperience with database queries and my clumsy interpretation of some example code, I inadvertently closed the connection to the database. After omitting that line, the example works fine now! :)

One thing, though: through GEOmetadb I find 17751 CEL files for GPL96, whereas a query directly at GEO indicates that it hosts a considerably larger number of these arrays (28011 samples). Any idea what may cause this discrepancy?
http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GPL96
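
One way to narrow this down might be to compare, within the local database, the number of all GPL96 samples with the number that list a CEL supplementary file. The sketch below is only an illustration under stated assumptions: it uses an in-memory SQLite database with a made-up toy `gsm` table (same column names as in the query from the vignette) instead of the real GEOmetadb.sqlite, and it does not establish what actually causes the gap.

```r
library(DBI)
library(RSQLite)

# Toy stand-in for GEOmetadb.sqlite; the rows are invented for illustration.
con <- dbConnect(SQLite(), ":memory:")
dbWriteTable(con, "gsm", data.frame(
  gsm = c("GSM1", "GSM2", "GSM3", "GSM4"),
  gpl = c("GPL96", "GPL96", "GPL97", "GPL96"),
  supplementary_file = c("ftp://example.org/GSM1.CEL.gz", NA,
                         "ftp://example.org/GSM3.CEL.gz",
                         "ftp://example.org/GSM4.txt.gz"),
  stringsAsFactors = FALSE
))

# Count all GPL96 samples and, separately, those with a CEL file attached.
counts <- dbGetQuery(con, paste(
  "select count(*) as all_samples,",
  "sum(case when supplementary_file like '%CEL.gz' then 1 else 0 end) as with_cel",
  "from gsm where gpl='GPL96'"))
counts

dbDisconnect(con)
```

Against the real database, the same query would be run on a connection to GEOmetadb.sqlite; if `all_samples` is close to the number GEO reports, the difference would come from samples without a CEL supplementary file rather than from missing records.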

Thanks again for your assistance,
Guido

-----Original Message-----
From: seandavi at gmail.com [mailto:seandavi at gmail.com] On Behalf Of Sean Davis
Sent: Thursday, June 30, 2011 14:03
To: Hooiveld, Guido
Cc: bioconductor (bioconductor at stat.math.ethz.ch); Seth Falcon
Subject: Re: [BioC] R crashes with GEOmetadb

See below.

On Wed, Jun 29, 2011 at 11:36 AM, Hooiveld, Guido <Guido.Hooiveld at wur.nl> wrote:
> Dear Sean and others,
>
> I am exploring the functionality of 'GEOmetadb'. I am specifically interested in downloading all CEL files for arrays run on a certain platform.
> To this end I am using the example from the GEOmetadb vignette (page 8), which should retrieve the number of GEO entries and CEL files for the Affymetrix HGU133A array.
> However, when executing that code R crashes and needs to exit...
> The error messages are not informative to me, but maybe you can deduce what is going wrong. Any feedback is appreciated.
>
> Regards,
> Guido
>
>
> R version 2.13.0 (2011-04-13)
> Copyright (C) 2011 The R Foundation for Statistical Computing
> ISBN 3-900051-07-0
> Platform: x86_64-unknown-linux-gnu (64-bit)
>
>>
>> library(GEOmetadb)
> Loading required package: GEOquery
> Loading required package: Biobase
>
> Welcome to Bioconductor
>
>  Vignettes contain introductory material. To view, type
>  'browseVignettes()'. To cite Bioconductor, see
>  'citation("Biobase")' and for packages 'citation("pkgname")'.
>
> Setting options('download.file.method.GEOquery'='curl')
> Loading required package: RSQLite
> Loading required package: DBI
>> getSQLiteFile()
> trying URL 'http://gbnci.abcc.ncifcrf.gov/geo/GEOmetadb.sqlite.gz'
> Content type 'text/plain; charset=ISO-8859-1' length 109446149 bytes (104.4 Mb)
> opened URL
> ================================================
> downloaded 104.4 Mb
>
> Unzipping...
> Metadata associate with downloaded file:
>                name               value
> 1     schema version                 1.0
> 2 creation timestamp 2011-06-18 09:50:00
> [1] "/home.local/guidoh/GEOmetadb.sqlite"
>>
>> con <- dbConnect(SQLite(), "GEOmetadb.sqlite")
>> dbDisconnect(con)

Sorry, Guido.  I missed this point in my first pass through your email.  Here, you disconnect the connection.

> [1] TRUE
>>
>> rs <- dbGetQuery(con,paste("select gsm,supplementary_file",
> +                            "from gsm where gpl='GPL96'",
> +                            "and supplementary_file like '%CEL.gz'"))

Here, you are using a disconnected connection object (con) to perform the query; it should fail with an error message but probably not a segmentation fault.  If you DO NOT disconnect the connection object, this query works fine.  Perhaps RSQLite should have a check of the connection object to make sure that it is connected to avoid the segmentation fault?
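
A minimal sketch of the order of operations described above: run the query while the connection is open and disconnect only afterwards. It assumes an in-memory SQLite database with a made-up toy `gsm` table in place of GEOmetadb.sqlite. The `dbIsValid()` guard is available in current versions of DBI and RSQLite; it may not exist in older releases such as the ones in this thread.

```r
library(DBI)
library(RSQLite)

# Toy stand-in for GEOmetadb.sqlite; the rows are invented for illustration.
con <- dbConnect(SQLite(), ":memory:")
dbWriteTable(con, "gsm", data.frame(
  gsm = c("GSM1", "GSM2", "GSM3"),
  gpl = c("GPL96", "GPL96", "GPL97"),
  supplementary_file = c("ftp://example.org/GSM1.CEL.gz", NA,
                         "ftp://example.org/GSM3.CEL.gz"),
  stringsAsFactors = FALSE
))

# Run every query while the connection is still open.
rs <- dbGetQuery(con, paste("select gsm,supplementary_file",
                            "from gsm where gpl='GPL96'",
                            "and supplementary_file like '%CEL.gz'"))
dim(rs)

# Disconnect only after the last query has returned.
dbDisconnect(con)

# Guard against reusing a closed connection object.
if (!dbIsValid(con)) {
  message("Connection is closed; reconnect with dbConnect() before querying.")
}
```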

Sean


> sessionInfo()
R version 2.13.0 Under development (unstable) (2011-02-26 r54608)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
[1] C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] RSQLite_0.9-4 DBI_0.2-5


> *** caught segfault ***
> address 0x8, cause 'memory not mapped'
>
> Traceback:
> 1: .Call("RS_SQLite_exec", conId, statement, bind.data, PACKAGE = .SQLitePkgName)
> 2: sqliteExecStatement(con, statement, bind.data)
> 3: sqliteQuickSQL(conn, statement, ...)
> 4: dbGetQuery(con, paste("select gsm,supplementary_file", "from gsm where gpl='GPL96'",     "and supplementary_file like '%CEL.gz'"))
> 5: dbGetQuery(con, paste("select gsm,supplementary_file", "from gsm where gpl='GPL96'",     "and supplementary_file like '%CEL.gz'"))
>
> Possible actions:
> 1: abort (with core dump, if enabled)
> 2: normal R exit
> 3: exit R without saving workspace
> 4: exit R saving workspace
> Selection: dim(rs)
> Selection:
>
>
> ---------------------------------------------------------
> Guido Hooiveld, PhD
> Nutrition, Metabolism & Genomics Group Division of Human Nutrition 
> Wageningen University Biotechnion, Bomenweg 2
> NL-6703 HD Wageningen
> the Netherlands
> tel: (+)31 317 485788
> fax: (+)31 317 483342
> email:      guido.hooiveld at wur.nl
> internet:   http://nutrigene.4t.com
> http://www.researcherid.com/rid/F-4912-2010
>
>