[R] Sparse matrix no longer sparse (Matrix Package)
Doran, Harold
HDoran at air.org
Fri Jul 12 15:24:28 CEST 2013
Here is code to completely replicate the issue with comments. I remain confused why simply changing one element of the ddi matrix to be non-integer changes two things: 1) It changes the class of the object I need (A Inverse) and it increases its memory.
Ideally, A inverse will remain stored as a sparse matrix no matter what (as it is sparse in my real world problem). When it is converted to a dense object, it blows up in memory and my R program halts.
library(Matrix)
### Create a symmetric matrix of class dsCMatrix
A <- diag(5, 10)
A[1, 5] <- A[5,1] <- 2
### Create a diagonal matrix of class ddi
D <- Diagonal(10, 50)
### This returns the inverse of A stored as a sparse matrix
### In my real world problem it consumes almost no memory at all
### this is the ideal type
A <- A %*%D
(aa <- solve(A))
class(aa)
object.size(aa)
### Now, let's only change one element of D to be non-integer
D[1] <- 1.5
### Notice here the inverse of the matrix A
### is now stored as a different object class than before
### even though the pattern of 0s and non-zeros remains the same
### It also increases in memory size
### In my real world problem, the matrix increases from
### about .03 megabytes to almost 2 megabytes and this causes R to choke and die
A <- A %*% D
(aa <- solve(A))
class(aa)
object.size(aa)
-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Jeff Newmiller
Sent: Thursday, July 11, 2013 3:22 PM
To: John Kane; r-help at r-project.org
Cc: dmbates at gmail.com; maechler at stat.math.ethz.ch
Subject: Re: [R] Sparse matrix no longer sparse (Matrix Package)
It seems to me that this issue should be reproducible with a small matrix, since the concern is the representation rather than the values.
---------------------------------------------------------------------------
Jeff Newmiller The ..... ..... Go Live...
DCN:<jdnewmil at dcn.davis.ca.us> Basics: ##.#. ##.#. Live Go...
Live: OO#.. Dead: OO#.. Playing
Research Engineer (Solar/Batteries O.O#. #.O#. with
/Software/Embedded Controllers) .OO#. .OO#. rocks...1k
---------------------------------------------------------------------------
Sent from my phone. Please excuse my brevity.
John Kane <jrkrideau at inbox.com> wrote:
>Just about anything I knew about matrices, I forgot years ago so I'm no
>help here but I'd suggest putting the matrix on something like
>Mediafire http://www.mediafire.com/ or Dropbox https://www.dropbox.com
>so people can download it and have a look.
>
>I agree that dput() is not really good for "big" data sets.
>
>Kingston ON Canada
>
>
>> -----Original Message-----
>> From: hdoran at air.org
>> Sent: Thu, 11 Jul 2013 17:10:54 +0000
>> To: hdoran at air.org, jrkrideau at inbox.com, r-help at r-project.org
>> Subject: RE: [R] Sparse matrix no longer sparse (Matrix Package)
>>
>> This is a terrible example as I didn't realize my code actually does
>> create a non-symmetric matrix and in this case the function behaves
>as
>> expected. Nonetheless, my original issue stands and that issue still
>does
>> not make sense.
>>
>> Apologies for bad example code.
>>
>> -----Original Message-----
>> From: r-help-bounces at r-project.org
>[mailto:r-help-bounces at r-project.org]
>> On Behalf Of Doran, Harold
>> Sent: Thursday, July 11, 2013 11:36 AM
>> To: 'John Kane'; r-help at r-project.org
>> Cc: dmbates at gmail.com; maechler at stat.math.ethz.ch
>> Subject: Re: [R] Sparse matrix no longer sparse (Matrix Package)
>>
>> Thank you, John. I originally used dput() but the output is huge.
>> However, here is a reproducible example of what I think very
>unexpected
>> behavior of some matrix functions.
>>
>>> ### Create a symmetric matrix of class dsCMatrix A <- as(diag(5,
>10),
>>> 'dsCMatrix') A[1, 5] <- A[5,1] <- 2
>>>
>>> ### Create a diagonal matrix of class ddi D <- Diagonal(10, 1)
>>>
>>> ### This works as it should
>>> aa <- Cholesky(A %*% D)
>>>
>>> ### Now, let's only change one element of D to be non-integer D[1]
><-
>>> 1.5
>>>
>>> ### AD is still symmetric, but here the Cholesky function complains
>>> that it is not aa <- Cholesky(A %*% D)
>> Error in Cholesky(as(A, "symmetricMatrix"), perm = perm, LDL = LDL,
>super
>> = super, :
>> error in evaluating the argument 'A' in selecting a method for
>function
>> 'Cholesky': Error in asMethod(object) :
>> not a symmetric matrix; consider forceSymmetric() or symmpart()
>>
>> ### For fun try this
>>
>>> L <- update(aa, as(A %*% D, 'symmetricMatrix'))
>> Error in asMethod(object) :
>> not a symmetric matrix; consider forceSymmetric() or symmpart()
>>
>> ### This does indeed work, but should I need to implement this step?
>>
>> Cholesky(forceSymmetric(A %*% D))
>>
>> So, there is something about changing the elements of a ddi matrix
>that
>> causes subsequent problems. Is there a good reason this occurs and
>> something I should be doing differently, or is this a bug?
>>
>> Thanks
>>
>> -----Original Message-----
>> From: John Kane [mailto:jrkrideau at inbox.com]
>> Sent: Thursday, July 11, 2013 10:57 AM
>> To: Doran, Harold; r-help at r-project.org
>> Subject: RE: [R] Sparse matrix no longer sparse (Matrix Package)
>>
>> The message got through but not the attachment. The R help list tends
>to
>> strip off attachements for security reasons. Files of types txt,
>png, &
>> pdf should get through.
>>
>> In most cases the accepted method of sending data is to use the
>dput()
>> function to output a file in the console and then copy and paste the
>> results into your email.
>>
>> So for file "dat1" one would just use dput(dat1) and paste the
>results
>> into an email.
>>
>> John Kane
>> Kingston ON Canada
>>
>>
>>> -----Original Message-----
>>> From: hdoran at air.org
>>> Sent: Thu, 11 Jul 2013 09:53:40 +0000
>>> To: r-help at r-project.org
>>> Subject: [R] Sparse matrix no longer sparse (Matrix Package)
>>>
>>> I sent this message yesterday with an attachment allowing for
>>> reproduction of the issue. But I think the attachment is preventing
>>> the message from coming through. If anyone is interested I will
>>> forward the attachment directly allowing for you to reproduce the
>>> issue I observe.
>>>
>>> On 7/10/13 2:38 PM, "Doran, Harold" <HDoran at air.org> wrote:
>>>
>>> >I have zero'd in on what appears to be the issue. This seems to be
>a
>>> >bug in Matrix, but I am not sure yet. I am attaching files that
>would
>>> >allow others to replicate this with my toy data.
>>>>
>>> >Notice the elements of D1 in the attached data are all integers. It
>>> >is a sparse, diagonal matrix.
>>>>
>>>>> library(Matrix)
>>>>> class(D1)
>>> >[1] "ddiMatrix"
>>> >attr(,"package")
>>> >[1] "Matrix"
>>>>
>>> >Now, I find the inverse of the matrix A as follows:
>>>>> A <- Ir + ZtZ %*% D1
>>>>> A.Inv <- solve(A, Ir)
>>>>
>>> >Notice now the inverse of A remains a dgCMatrix and it is
>relatively
>>> >small in size, only 33424 bytes.
>>>>> class(A.Inv)
>>> >[1] "dgCMatrix"
>>> >attr(,"package")
>>> >[1] "Matrix"
>>>>
>>>>> object.size(A.Inv)
>>> >33424 bytes
>>>>
>>> >Now, if I change an element of the matrix D1 to be non-integer, D1
>>> >still has the same class as it did before
>>>>
>>>>> D1[1] <- 1.2
>>>>
>>>>> class(D1)
>>> >[1] "ddiMatrix"
>>> >attr(,"package")
>>> >[1] "Matrix"
>>>>
>>> >Now, if I use this new version of D1 in the same calculations as
>>> >above, notice that A.Inv is no longer a dgCMatrix but instead
>becomes
>>> >a dgeMatrix. It then increases from an object of size 33424 bytes
>to
>>> >an object of size 2001112 bytes!
>>>>
>>>>> A <- Ir + ZtZ %*% D1
>>>>> A.Inv <- solve(A, Ir)
>>>>> class(A.Inv)
>>> >[1] "dgeMatrix"
>>> >attr(,"package")
>>> >[1] "Matrix"
>>>>> object.size(A.Inv)
>>> >2001112 bytes
>>>>
>>> >What I desire is that the object A.Inv remain sparse at all times
>and
>>> not
>>> >become dense. But, perhaps there is a reason this change occurs
>that
>>> >I don't fully understand.
>>>>
>>> >I can of course coerce it back to a sparse matrix and it reduces
>back
>>> >in size.
>>>>> object.size(as(A.Inv, 'sparseMatrix'))
>>> >33424 bytes
>>>>
>>> >I of course recognize it requires more memory to store floating
>>> >points than integers, but is this large increase on the order of
>>> >magnitude that seems about right?
>>>>
>>> >Is there a reason the floating point in D1 causes for A.Inv to no
>>> >longer remain sparse?
>>>>
>>> >Thank you for your help,
>>> >Harold
>>>>
>>>>
>>>>
>>>>
>>>>
>>> >-----Original Message-----
>>> >From: r-help-bounces at r-project.org
>>> >[mailto:r-help-bounces at r-project.org]
>>> >On Behalf Of Doran, Harold
>>> >Sent: Wednesday, July 10, 2013 11:42 AM
>>> >To: r-help at r-project.org
>>> >Cc: dmbates at gmail.com; 'maechler at stat.math.ethz.ch'
>>> >Subject: [R] Sparse matrix no longer sparse (Matrix Package)
>>>>
>>> >I have a large function computing an iterative algorithm for
>fitting
>>> >mixed linear models. Almost all code relies on functions from the
>>> >Matrix package. I've come across an issue that I do not believe
>>> >previously occurred in earlier versions of R or Matrix.
>>>>
>>> >I have a large, sparse matrix, A as
>>>>
>>>>> class(A);dim(A)
>>> >[1] "dgCMatrix"
>>> >attr(,"package")
>>> >[1] "Matrix"
>>> >[1] 12312 12312
>>>>
>>> >I am in a position where I must find its inverse. I realize this
>is
>>> less
>>> >than ideal, and I have two ways of doing this
>>>>
>>> >A.Inv <- solve(A, Ir) or just solve(A)
>>>>
>>> >Where Ir is an identity matrix with the same dimensions as A and it
>>> >is also sparse
>>>>
>>>>> class(Ir)
>>> >[1] "ddiMatrix"
>>> >attr(,"package")
>>> >[1] "Matrix"
>>>>
>>> >The issue, however, is that the inverse of A is converted into a
>>> >dense matrix and this becomes a huge memory hog, causing the rest
>of
>>> >the algorithm to fail. In prior versions this remained as a sparse
>>> matrix.
>>>>
>>>>> A.Inv[1:5, 1:5]
>>> >5 x 5 Matrix of class "dgeMatrix"
>>>> [,1] [,2] [,3] [,4] [,5]
>>> >[1,] 0.6878713 0.0000000 0.0000000 0.0000000 0.0000000 [2,]
>0.0000000
>>> >0.6718767 0.0000000 0.0000000 0.0000000 [3,] 0.0000000 0.0000000
>>> >0.5076945 0.0000000 0.0000000 [4,] 0.0000000 0.0000000 0.0000000
>>> >0.2324122 0.0000000 [5,] 0.0000000 0.0000000 0.0000000 0.0000000
>>> 0.2139975
>>>>
>>> >I could coerce this matrix to become sparse such as
>>>>
>>>>> AA <- as(A.Inv, 'sparseMatrix')
>>>>> class(AA)
>>> >[1] "dgCMatrix"
>>> >attr(,"package")
>>> >[1] "Matrix"
>>>>
>>>>> AA[1:5, 1:5]
>>> >5 x 5 sparse Matrix of class "dgCMatrix"
>>>>
>>> >[1,] 0.6878713 . . . .
>>> >[2,] . 0.6718767 . . .
>>> >[3,] . . 0.5076945 . .
>>> >[4,] . . . 0.2324122 .
>>> >[5,] . . . . 0.2139975
>>>>
>>> >But I don't think this is best.
>>>>
>>> >So, my question is why is a matrix that is sparse turning into a
>>> >dense matrix? Can I avoid that and keep it sparse without having to
>>> >coerce it to be sparse after it is created?
>>>>
>>> >Thank you very much
>>> >Harold
>>>>
>>>>
>>>>> sessionInfo()
>>> >R version 3.0.1 (2013-05-16)
>>> >Platform: x86_64-w64-mingw32/x64 (64-bit)
>>>>
>>> >locale:
>>> >[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United
>>> >States.1252 [3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
>>> >[5] LC_TIME=English_United States.1252
>>>>
>>> >attached base packages:
>>> >[1] stats graphics grDevices utils datasets methods
>base
>>>>
>>> >other attached packages:
>>> >[1] lme4_0.999999-2 Matrix_1.0-12 lattice_0.20-15
>>>>
>>> >loaded via a namespace (and not attached):
>>> >[1] grid_3.0.1 nlme_3.1-109 stats4_3.0.1 tools_3.0.1
>>>>
>>> > [[alternative HTML version deleted]]
>>>>
>>> >______________________________________________
>>> >R-help at r-project.org mailing list
>>> >https://stat.ethz.ch/mailman/listinfo/r-help
>>> >PLEASE do read the posting guide
>>> >http://www.R-project.org/posting-guide.html
>>> >and provide commented, minimal, self-contained, reproducible code.
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
>____________________________________________________________
>GET FREE SMILEYS FOR YOUR IM & EMAIL - Learn more at
>http://www.inbox.com/smileys Works with AIM®, MSN® Messenger, Yahoo!®
>Messenger, ICQ®, Google Talk™ and most webmails
>
>______________________________________________
>R-help at r-project.org mailing list
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide
>http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
More information about the R-help
mailing list