[R] R crash with 'library(Matrix); as(x, "dgCMatrix")' ...

Martin Maechler maechler at stat.math.ethz.ch
Fri Jul 7 10:34:07 CEST 2006


>>>>> "JohnT" == Thaden, John J <ThadenJohnJ at uams.edu>
>>>>>     on Thu, 6 Jul 2006 12:29:42 -0500 writes:

    JohnT> Martin Maechler replied to my query "Warning while subsetting...":
    MartinM> >>>>> "JohnT" == Thaden, John J <ThadenJohnJ at uams.edu>
    MartinM> >>>>>     on Thu, 6 Jul 2006 00:02:10 -0500 writes:

    JohnT> ...
    JohnT> > # While subsetting x, I was surprised to get this warning: 
    JohnT> > y<-x[1:300,]
    JohnT> Warning message:
    JohnT> number of items to replace is not a multiple of
    JohnT> replacement length

    MartinM> and later

    JohnT> Sorry, I omitted background information:
    JohnT> R version: 2.3.0
    JohnT> OS: Windows XP
    JohnT> CPU:  Pentium III, 
    JohnT> RAM:  768 MB

    MartinM> You omitted the most pertinent information: The 
    MartinM> version of 'Matrix' you are using.
    MartinM> The latest released version of Matrix does
    MartinM> *not* show the behavior you mentioned. {So I have 
    MartinM> now spent 20 minutes just because you did not 
    MartinM> update 'Matrix'..}

    JohnT> The Matrix package was version 0.995-10, now is 0.995-11. 
    JohnT> The R base was version 2.3.0, now is 2.3.1. 
    JohnT> Subsetting 'y <- x[1:300,]' now works. Please accept my apology.

    JohnT> Also, what command-line memory settings might prevent
    JohnT> R from crashing while using the Matrix package to 
    JohnT> convert my 600 X 4482 dgTMatrix to the dgCMatrix class
    JohnT> or to an expanded Matrix, via the as() function? I can
    JohnT> do this with half of the matrix, 300 x 4482.

    MartinM> It's hard to believe that you get a "crash" 
    MartinM> when coercing to 'dgC' -- but of course this 
    MartinM> really depends how much memory you have already
    MartinM> gobbled up by other large objects in your R
    MartinM> workspace, or by other applications running at
    MartinM> the same time in Windows.  Coercing to a full 
    MartinM> matrix will of course require 8 * 601 * 4482 = 
    MartinM> 21549456 extra bytes just for the numbers.
    MartinM> That's only 21.5 Megabytes, so I wonder..
    MartinM> 
    MartinM> I have never seen R crashes from using 'Matrix', 

 (actually that's not even true; at some point in time we had a
  bug in 'Matrix' which led to spurious segmentation faults)

    MartinM> but then I work with an operating system, not 
    MartinM> with M$ Windows. 
    MartinM> 
    MartinM> Maybe you meant you got an error message 
    MartinM> "... memory allocation .."?

    JohnT> Testing again, I closed all applications; disabled antivirus; 
    JohnT> opened RGui; removed all R objects but 'x' (a 600x4482 dgTMatrix); 
    JohnT> opened WinXP's 'Task Manager'; saw only "Rgui" under 
    JohnT> 'Applications'; saw processes using a total of 287 MB of memory
    JohnT> under 'Processes'; closed 'Task Manager'; and typed R commands:

    >> # Steps leading to an R crash...
    >> ls()
    JohnT> [1] "x"
    >> str(x)
    JohnT> Formal class 'dgTMatrix' [package "Matrix"] with 6 slots
    JohnT> ..@ i       : int [1:923636] 1 2 3 4 5 6 7 8 9 10 ...
    JohnT> ..@ j       : int [1:923636] 1 1 1 1 1 1 1 1 1 1 ...
    JohnT> ..@ Dim     : int [1:2] 600 4482
    JohnT> ..@ Dimnames:List of 2
    JohnT> .. ..$ : chr [1:601] "50" "51" "52" "53" ...
    JohnT> .. ..$ : chr [1:4482] "1" "2" "3" "4" ...
    JohnT> ..@ x       : num [1:923636] 50.2 51.2 52.2 53.2 54.2 ...
    JohnT> ..@ factors : list()
    >> gc()
    JohnT> used (Mb) gc trigger (Mb) max used (Mb)
    JohnT> Ncells  183529  5.0     407500 10.9   350000  9.4
    JohnT> Vcells 1928101 14.8    2286173 17.5  1928652 14.8
    >> library(Matrix)
    JohnT> Loading required package: lattice
    >> gc()
    JohnT> used (Mb) gc trigger (Mb) max used (Mb)
    JohnT> Ncells  627772 16.8    1073225 28.7  1073225 28.7
    JohnT> Vcells 2165773 16.6    3345184 25.6  2332013 17.8
    >> search()
    JohnT> [1] ".GlobalEnv" "package:Matrix" "package:lattice"
    JohnT> [4] "package:methods" "package:stats" "package:graphics"  
    JohnT> [7] "package:grDevices" "package:utils" "package:datasets"
    JohnT> [10] "Autoloads"  "package:base"     
    >> #Now the line that causes crashes...
    >> y <- as(x,"dgCMatrix")

    JohnT> After ~10 seconds, R blinks off and a WinXP dialog appears: 

    JohnT> R for Windows GUI front-end has encountered 
    JohnT> a problem and needs to close.  We are sorry
    JohnT> for the inconvenience....Error signature:
    JohnT> AppName: rgui.exe  AppVer: 2.31.38247.0  
    JohnT> ModName: matrix.dll Offset: 0000ff31....
    JohnT> Report error?

Thanks a lot, John, for the more detailed report.
I do wonder how it happens, since the memory allocation is not
really big.   E.g., I can easily solve ``your'' (well, a
simulated version of it) problem on a machine with only 512 MB
RAM:

  library("Matrix")

  ## MM: construct a matrix *as* John's :
  d <- as.integer(c(600,4482))
  n0 <- 923636
  set.seed(1)
  M <- new("dgTMatrix", Dim = d,
           ## i and j are 0-based in the triplet representation
           i = sort(sample(0:(d[1]-1), size = n0, replace = TRUE)),
           j = sample(0:(d[2]-1), size = n0, replace = TRUE),
           x = round(rnorm(n0, mean = 50, sd = 10), 1))
  dimnames(M) <- list(paste("r", 1:d[1], sep=''),
                      paste("C", 1:d[2], sep=''))
  str(M)

  M1.10 <- M[1:10,] # gave warning in earlier versions of 'Matrix'

  ## on 'nanny' which has just 512 MB  (with other processes active, etc):
  gc()
  ##           used (Mb) gc trigger (Mb) max used (Mb)
  ## Ncells  642690 17.2    1073225 28.7  1073225 28.7
  ## Vcells 3136547 24.0    8305047 63.4  7988501 61.0

  mC <- as(M, "dgCMatrix")
  ##           ---------
  gc()
  ##           used (Mb) gc trigger (Mb) max used (Mb)
  ## Ncells  642721 17.2    1073225 28.7  1073225 28.7
  ## Vcells 4311327 32.9    8305047 63.4  7988501 61.0

  ## well, this will need a bit more memory, but should still work:
  mm <- as(M, "matrix")
  ##           -------
  gc()
  ##-           used (Mb) gc trigger (Mb) max used (Mb)
  ##- Ncells  642725 17.2    1073225 28.7  1073225 28.7
  ##- Vcells 7000528 53.5    8438708 64.4  7988501 61.0
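
As a rough cross-check of the gc() numbers above, the storage needed
by each representation can be estimated by hand.  This is only a
sketch under stated assumptions: 4-byte integers, 8-byte doubles,
object headers and dimnames ignored, and the 600 rows taken from the
'Dim' slot (the dense estimate quoted earlier used 601).  The names
'nr', 'nc', 'nnz' and 'bytes' are made up for the illustration.

```r
## Rough storage estimates (bytes) for a sparse matrix with 'nnz' stored entries.
## Assumptions: 4-byte integers, 8-byte doubles; headers and dimnames ignored.
nr  <- 600L
nc  <- 4482L
nnz <- 923636

bytes <- c(
  dgT   = nnz * (4 + 4 + 8),             # slots i, j (integer) and x (double)
  dgC   = nnz * (4 + 8) + (nc + 1) * 4,  # slots i, x, plus p of length ncol + 1
  dense = 8 * nr * nc                    # full numeric matrix
)

## the same figures in units of 2^20 bytes
round(bytes / 2^20, 1)
```

The 'dgC' figure is an upper bound: duplicated (i, j) pairs in the
triplet form are summed during the coercion, so the compressed result
can hold fewer than 'nnz' entries.  Either way, all three estimates
are tiny compared with 512 MB of RAM.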


I see in the CHANGES file for {R for Windows}

>> R 2.3.1 patched
>> ===============
>> 
>>  [.........................]
>> 
>> R could crash when very low on memory. (PR#8981)

So maybe you could even try running "R 2.3.1 patched" for Windows,
which you can get from here,
      http://cran.us.r-project.org/bin/windows/base/rpatched.html
and see if your crashes go away?
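
Independently of the memory question, it may be worth checking that
'x' itself is internally consistent.  The str(x) output quoted above
shows 'Dim' with 600 rows but 601 row names, and triplet indices are
0-based (as in the simulated example), so indices built 1-based could
run one past the end.  Whether that is what happens here is only a
guess.  The helper below, 'check_triplet', is hypothetical and not
part of 'Matrix'; it is a minimal sketch of such a check:

```r
## Hypothetical helper (NOT part of 'Matrix'): report inconsistencies
## between triplet indices, dimensions and dimnames.
check_triplet <- function(i, j, dim, dimnames = list(NULL, NULL)) {
  problems <- character(0)
  if (length(i) != length(j))
    problems <- c(problems, "i and j differ in length")
  ## triplet indices are 0-based: valid values are 0 .. dim - 1
  if (length(i) && (min(i) < 0 || max(i) >= dim[1]))
    problems <- c(problems, "row index out of range")
  if (length(j) && (min(j) < 0 || max(j) >= dim[2]))
    problems <- c(problems, "column index out of range")
  for (k in 1:2) {
    dn <- dimnames[[k]]
    if (!is.null(dn) && length(dn) != dim[k])
      problems <- c(problems,
                    sprintf("dimnames[[%d]] has length %d, but Dim[%d] is %d",
                            k, length(dn), k, dim[k]))
  }
  problems   # character(0) means nothing suspicious was found
}

## toy case mimicking the mismatch: 3 rows declared, 4 row names, row index 3
res <- check_triplet(i = c(0L, 1L, 3L), j = c(0L, 0L, 1L), dim = c(3L, 2L),
                     dimnames = list(c("50", "51", "52", "53"), NULL))
print(res)
```

On the real object this would be called as
check_triplet(x@i, x@j, x@Dim, x@Dimnames).  validObject(x) from the
'methods' package is the standard way to run a class's own validity
checks, though which of these conditions it catches for this version
of 'Matrix' is not something the sketch can promise.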

Regards,
Martin


