[BioC] Rsubread crashes in 32bit linux

Wei Shi shi at wehi.EDU.AU
Wed Jun 6 06:49:28 CEST 2012


Dear Robert,

We do not have a 32-bit Linux machine here, but we managed to reproduce the problem you encountered using a 32-bit virtual machine running on a 64-bit Linux host. It turned out that the problem occurs when one of our calls to the malloc() function fails to obtain the requested memory from the system; that is, the system runs out of memory and cannot allocate any more to the buildindex() function.

We then let the buildindex() function request a smaller amount of memory (1000MB), and this solved the problem. So I recommend that you set the 'memory' parameter of buildindex() to 1000. The yeast genome is tiny, so you do not need 2500MB of memory to build an index for it. Regardless of how big or small the reference genome is, buildindex() requires at least 1000MB of memory to build the hash tables and remove highly repetitive 16-mers.
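As a rough sketch of that recommendation, the snippet below repeats the call from your transcript with only the 'memory' value changed. The helper index_memory_mb() is hypothetical and not part of Rsubread; it simply assumes that a 32-bit R session (pointer size of 4 bytes) should stay at the 1000MB minimum mentioned above, and that larger requests are fine elsewhere. The file name allChr.fa and the base name subreadindex are taken from your message.

```r
# Hypothetical helper (NOT part of Rsubread): choose the 'memory' value in MB.
# Assumption: 1000 MB is the minimum buildindex() needs, as stated above, and
# on a 32-bit R we stay at that minimum to keep the memory request small.
index_memory_mb <- function(requested_mb,
                            pointer_bytes = .Machine$sizeof.pointer) {
  min_mb <- 1000
  if (pointer_bytes == 4) min_mb else max(requested_mb, min_mb)
}

mem <- index_memory_mb(2500)

# Only attempt the build when Rsubread and the FASTA file are available.
if (requireNamespace("Rsubread", quietly = TRUE) && file.exists("allChr.fa")) {
  Rsubread::buildindex(basename = "subreadindex",
                       reference = "allChr.fa",
                       memory = mem)
}
```

On the 32-bit teaching machines this amounts to calling buildindex() with memory=1000; the helper only saves you from editing the script when the same code is run on a 64-bit machine.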

Also note that the mapping results are not affected by the amount of memory requested in the index building step; the amount of memory used only affects the running time. For example, building the index for the mouse genome with 8GB of memory will give you a mapping speed twice as fast as building it with 4GB. For yeast, however, the entire index will always be loaded into memory in one go, because the genome is tiny and the minimum of 1GB used by buildindex() is big enough to accommodate the hash table, the genome sequences and other related information.

Finally, the problem you encountered did not occur in version 1.1.1 because genome sequences were not included in the built index by default in that version, whereas they are included in the index in newer versions.

Hope this solves your problem, but please let us know if it doesn't.

Cheers,
Wei



On Jun 5, 2012, at 5:45 PM, Robert Castelo wrote:

> hi,
> 
> the computer room at my university where we do practicals on R & Bioconductor runs a 32bit linux distribution and when i tried to run the latest version of the Rsubread package (1.6.3) it crashes when calling the buildindex() function on a multifasta file with the yeast genome. this does *not* happen under a 64bit linux distribution.
> 
> i have verified that installing the version before (1.4.4) on the current R 2.15 it also crashes (on the 32bit), but two versions before, the 1.1.1, it does *not* and it works smoothly on this 32bit linux distribution.
> 
> i'm pasting below the output of using the 1.6.3 and 1.1.1 on R 2.15 where allChr.fa is the multifasta file with the yeast genome.
> 
> so i can manage by now the problem by using the 1.1.1 version on R 2.15 for my teaching but i wonder whether there would be some easy solution for this, or even if it could be a symptom of something else that the Rsubread developers should worry about. i know that using a 32bit system nowadays is quite obsolete but this is what i got for teaching :( and i would be happy to let my students play with the latest version of Rsubread in the future.
> 
> 
> thanks!!!
> robert.
> 
> ======================Rsubread 1.6.3 on R 2.15=======================
> 
>> library(Rsubread)
>> sessionInfo()
> R version 2.15.0 (2012-03-30)
> Platform: i686-pc-linux-gnu (32-bit)
> 
> locale:
> [1] LC_CTYPE=ca_ES.UTF-8       LC_NUMERIC=C              
> [3] LC_TIME=ca_ES.UTF-8        LC_COLLATE=ca_ES.UTF-8    
> [5] LC_MONETARY=ca_ES.UTF-8    LC_MESSAGES=ca_ES.UTF-8   
> [7] LC_PAPER=C                 LC_NAME=C                 
> [9] LC_ADDRESS=C               LC_TELEPHONE=C            
> [11] LC_MEASUREMENT=ca_ES.UTF-8 LC_IDENTIFICATION=C       
> 
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base     
> 
> other attached packages:
> [1] Rsubread_1.6.3
> 
>> buildindex(basename="subreadindex", reference="allChr.fa", memory=2500)
> 
> Building a base-space index.
> Size of memory used=2500 MB
> Base name of the built index = subreadindex
> 
> *** caught segfault ***
> address 0xdf670cc0, cause 'memory not mapped'
> 
> Traceback:
> 1: .C("R_buildindex_wrapper", argc = as.integer(n), argv = as.character(cmd),     PACKAGE = "Rsubread")
> 2: buildindex(basename = "subreadindex", reference = "allChr.fa",     memory = 2500)
> 
> Possible actions:
> 1: abort (with core dump, if enabled)
> 2: normal R exit
> 3: exit R without saving workspace
> 4: exit R saving workspace
> Selection: 
> 
> 
> ======================Rsubread 1.1.1 on R 2.15=======================
> 
>> library(Rsubread)
>> buildindex(basename="subreadindex", reference="allChr.fa", memory=2500)
> 
> Building the index in the base space.
> Size of memory requested=2500 MB
> Index base name = subreadindex
> INDEX ITEMS PER PARTITION = 275940352
> 
> completed=40.88%; time used=1.7s; rate=2955.1k bps/s; total=12m bps
> completed=81.76%; time used=2.4s; rate=4111.8k bps/s; total=12m bps
> All the chromosome files are processed.
> | Dumping index [===========================================================>]
> Index subreadindex is successfully built.
>> sessionInfo()
> R version 2.15.0 (2012-03-30)
> Platform: i686-pc-linux-gnu (32-bit)
> 
> locale:
> [1] LC_CTYPE=ca_ES.UTF-8       LC_NUMERIC=C              
> [3] LC_TIME=ca_ES.UTF-8        LC_COLLATE=ca_ES.UTF-8    
> [5] LC_MONETARY=ca_ES.UTF-8    LC_MESSAGES=ca_ES.UTF-8   
> [7] LC_PAPER=C                 LC_NAME=C                 
> [9] LC_ADDRESS=C               LC_TELEPHONE=C            
> [11] LC_MEASUREMENT=ca_ES.UTF-8 LC_IDENTIFICATION=C       
> 
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base     
> 
> other attached packages:
> [1] Rsubread_1.1.1
> 

