[BioC] GOstats locked database

Janet Young jayoung at fhcrc.org
Thu Jan 24 00:29:39 CET 2008


I have an additional piece of information for this problem. Not sure  
it helps, but thought I'd send it anyway.

My script just crashed again, but this time not while running a  
hyperGTest, although possibly while looking at the results of a  
hyperGTest. It crashed somewhere within the following loop, perhaps  
at the geneIdsByCategory(alltests[[a]],sigCategories(alltests[[a]]))  
line
(I only suspect that line because it uses the hyperGTest result  
stored in alltests[[a]], and it was hyperGTest that was the problem  
every other time, but I could easily be wrong).

 > for (a in 1:4) {
+    if (length(sigCategories(alltests[[a]])) == 0) next
+    genes <- geneIdsByCategory(alltests[[a]],sigCategories(alltests[[a]]))
+    for (b in 1:length(genes)) {
+       thesegenes <- genes[[b]]
+       if (length(thesegenes)==0) next
+       for (c in 1:length(thesegenes)) {
+          signifGeneInfo[rowcount,"Test"] <- names(alltests)[a]
+          signifGeneInfo[rowcount,"GO_ID"] <- names(genes)[b]
+          signifGeneInfo[rowcount,"EntrezID"] <- thesegenes[c]
+          signifGeneInfo[rowcount,"GeneName"] <- symbolsfromAnnPkg[[ thesegenes[c] ]]
+          signifGeneInfo[rowcount,"Term"] <- Term(get(names(genes)[b],GOTERM))
+          rowcount <- rowcount + 1
+       }
+    }
+ }
Error in sqliteFetch(rs, n = -1, ...) :
   RSQLite driver: (RS_SQLite_fetch: failed first step: database is locked)
Calls: get ... dbGetQuery -> sqliteQuickSQL -> sqliteFetch -> .Call
Execution halted
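
One possible workaround, sketched here under assumptions rather than tested on the cluster: if the lock is only temporary, the calls that touch the annotation databases could be wrapped in a small retry helper. The helper name retryIfLocked and its arguments maxTries and waitSecs are hypothetical (they are not part of GOstats, AnnotationDbi or RSQLite), and the sketch assumes the error message contains the text "database is locked", as in the output above.

```r
# Hypothetical helper, not part of GOstats, AnnotationDbi or RSQLite.
# Calls fun() and, if it fails with a "database is locked" error,
# waits and tries again, up to maxTries attempts in total.
retryIfLocked <- function(fun, maxTries = 5, waitSecs = 10) {
  for (attempt in seq_len(maxTries)) {
    result <- tryCatch(fun(), error = function(e) e)
    if (!inherits(result, "error")) {
      return(result)                      # success: hand back the value
    }
    locked <- grepl("database is locked", conditionMessage(result),
                    fixed = TRUE)
    if (!locked || attempt == maxTries) {
      stop(result)                        # unrelated error, or out of retries
    }
    Sys.sleep(waitSecs)                   # give the lock time to clear
  }
}

# Intended use in the script (params as in the earlier message below):
# thishgOver <- retryIfLocked(function() hyperGTest(params))
```

This only hides a transient lock; it would not help if the database stays locked for longer than maxTries * waitSecs seconds.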

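On the question in the earlier message below about printing sessionInfo() just before a batch run dies: a minimal sketch, assuming the script is run non-interactively with R CMD BATCH, is to set the error option near the top of the script. As far as I understand, the handler has to call q() itself, because once a custom handler is set R no longer halts the batch run on error by default.

```r
# Assumption: this goes near the top of the script run with R CMD BATCH.
# On an uncaught error R calls this handler; it prints the session
# details and then quits with a non-zero exit status.
options(error = function() {
  print(sessionInfo())                    # ends up in the .Rout file
  q("no", status = 1, runLast = FALSE)
})
```
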
On Jan 23, 2008, at 2:29 PM, Janet Young wrote:

> Hi all,
>
> I'm having some trouble with a locked database with GOstats,  
> perhaps due to running multiple simultaneous processes that are all  
> accessing GO.db?
>
> I'm using R CMD BATCH to run an R script I wrote, and I'm doing  
> that simultaneously from 12 different terminal windows, each logged  
> in to a single node of a linux cluster. Some processes may be  
> sharing a node (2 CPUs per node). I'm happy to send the entire  
> script, if that's useful, but for now here are just some snippets.  
> Here's the basic problem:
>
> > params <- new("GOHyperGParams", geneIds = geneentrezIDs,  
> universeGeneIds = allgeneentrezIDs, ontology="BP",  
> annotation="org.Hs.eg.db",pvalueCutoff=hgCutoff, conditional=FALSE,  
> testDirection = "over")
> > thishgOver<-hyperGTest(params)
> Error in sqliteFetch(rs, n = -1, ...) :
>   RSQLite driver: (RS_SQLite_fetch: failed first step: database is locked)
> Calls: hyperGTest ... dbGetQuery -> sqliteQuickSQL -> sqliteFetch -> .Call
> Execution halted
>
> It's a very sporadic problem - I'm actually using the script to  
> loop through a bunch of simulated datasets and run hyperGTest - it  
> does fine for a while and then suddenly has a problem. I can't be  
> sure, but it seems like several of the processes I was running  
> simultaneously all had a problem around the same time (which  
> wouldn't be surprising if something suddenly happened to the  
> database).
>
> It's also possible that our linux nodes are having some  
> intermittent connectivity issues to the mounted drives - could that  
> cause the database locked error? If so would there be a way to make  
> hyperGTest robust to a temporary problem like that?
>
> As well as hyperGTest, the script also accesses GO information at  
> various points, with commands like these:
> > Term(get(names(genes)[b],GOTERM))
> > geneentrezIDs <- geneentrezIDs[!is.na(mget(geneentrezIDs,envir=org.Hs.egGO,ifnotfound=NA))]
> I was running a very similar version of the script last week, with  
> no problem, and I think the above two commands are the only things  
> I've added that might be accessing the GO data. I'm not clear on  
> which of these commands use the same database as one another: (a)  
> mget from org.Hs.egGO (b) hyperGTest with  
> annotation="org.Hs.eg.db", (c) get from GOTERM.
>
> Here is the output of sessionInfo(), run just before I started  
> looping through the datasets, so several iterations of the mget  
> from org.Hs.egGO and the hyperGTest have happened after running  
> this sessionInfo, but I think all relevant libraries were loaded.  
> (is there a way to make R output sessionInfo immediately before it  
> terminates with error, when running in batch mode?)
>
> > sessionInfo()
> R version 2.6.1 Patched (2007-12-02 r43572)
> i686-pc-linux-gnu
>
> locale:
> LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=en_US.UTF-8;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C
>
> attached base packages:
> [1] splines   tools     stats     graphics  grDevices utils     datasets
> [8] methods   base
>
> other attached packages:
>  [1] org.Hs.eg.db_2.0.2  GOstats_2.4.0       Category_2.4.0
>  [4] genefilter_1.16.0   survival_2.34       RBGL_1.14.0
>  [7] annotate_1.16.1     xtable_1.5-2        GO.db_2.0.2
> [10] AnnotationDbi_1.0.6 RSQLite_0.6-4       DBI_0.2-4
> [13] Biobase_1.16.2      graph_1.16.1
>
> loaded via a namespace (and not attached):
> [1] cluster_1.11.9
>
>
> And here's some other, possibly pertinent information:
> [12] kpvpt50:/home/jayoung/traskdata/janet/forOthers/forIlona/GOanalysis/doGOmoreregions_slightly_better_again/DCLoss_10percent> ls -l ~/traskdata/lib_linux/R/library/GO.db/extdata/
> total 37364
> -rw-r--r--  1 jayoung trasklab 38252544 Dec  3 13:55 GO.sqlite
> So I can write to GO.sqlite. Should I make it read-only, even for  
> myself? Would that cause trouble if I want to overwrite it in future?
> [93] bedrock:/home/jayoung/traskdata/janet/forOthers/forIlona/GOanalysis/doGOmoreregions_slightly_better_again> ls -l ~/traskdata/lib_linux/R/library/org.Hs.eg.db/extdata/
> total 187130
> -rw-r--r--   1 jayoung  trasklab 95802368 Dec 13 14:50 org.Hs.eg.sqlite
>
>
> Thanks for any advice - this is a tricky one as it happens sometime  
> in the middle of a ~12 hour run, and is not necessarily  
> reproducible. Hopefully I've provided enough information here to  
> track down the problem.
>
> Janet
>
> -------------------------------------------------------------------
>
> Dr. Janet Young (Trask lab)
>
> Fred Hutchinson Cancer Research Center
> 1100 Fairview Avenue N., C3-168,
> P.O. Box 19024, Seattle, WA 98109-1024, USA.
>
> tel: (206) 667 1471 fax: (206) 667 6524
> email: jayoung at fhcrc.org
>
> http://www.fhcrc.org/labs/trask/
>
> -------------------------------------------------------------------
>
>
>


