[R] Creating a sparse matrix from a file

Martin Maechler maechler at stat.math.ethz.ch
Wed Oct 28 14:52:52 CET 2009


>>>>> "PP" == Pallavi P <pallavip.05 at gmail.com>
>>>>>     on Wed, 28 Oct 2009 16:30:25 +0530 writes:

    PP> Hi Martin,
    PP> I followed your example on my data set, which has nonzero values in
    PP> 300k positions of a 22638 x 80914 sparse matrix. I was able to load the
    PP> data and do some operations (essentially t(m) %*% m). However,
    PP> when I tried to display a value of the resulting matrix, I got the
    PP> error below:
    PP>
    PP> Error in asMethod(object) :
    PP> Cholmod error 'out of memory' at file:../Core/cholmod_memory.c, line 148

    PP> The sequence of commands I used are:

    >> uac=read.table('C:\\personal\\code\\data\\user_album_count.csv',sep=',' , header=T)
    >> library(Matrix)
    >> m<-sparseMatrix(i=uac[,"user"],j=uac[,"item"],x=uac[,"count"])
    >> cm<-t(m) %*% m

The above is less efficient than

    cm <- crossprod(m)

Please use the latter, which avoids forming t(m) explicitly
(not just for sparse matrices, but for all matrices in R!).
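
A minimal sketch of that comparison on a small random matrix. The sizes and
the names m_small, cm1 and cm2 are made up for illustration and are not taken
from the data in this thread:

```r
library(Matrix)

set.seed(1)
# Small random sparse matrix (50 x 30); repeated (i, j) pairs are summed.
m_small <- sparseMatrix(i = sample(50, 200, replace = TRUE),
                        j = sample(30, 200, replace = TRUE),
                        x = round(100 * rnorm(200)),
                        dims = c(50, 30))

cm1 <- t(m_small) %*% m_small   # explicit transpose, then product
cm2 <- crossprod(m_small)       # same product without forming t(m_small)

# Both routes give the same numbers.
stopifnot(isTRUE(all.equal(as.matrix(cm1), as.matrix(cm2))))
```

Both calls compute the same 30 x 30 product; crossprod() simply skips the
intermediate transpose.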

    PP> Up to this point everything ran; however, when I tried to display cm[1,1],
    PP> I got the above error. Kindly let me know if I am doing anything wrong
    PP> here.

Interestingly, we had a recent thread on R-devel,
which also made a point about excessive memory usage when
accessing elements of a sparse matrix.

I'd really like to investigate further;
but can you ***PLEASE*** use reproducible code, i.e.,
similar to the one I used, rather than reading data from one of
your files.

Note that your matrix is still fine and you should be able to work
with it, even though it seems the operation

  a <- cm[1,1]

is currently implemented very sub-optimally.

I'm busy for the rest of today with other duties,
but I am looking forward to receiving **reproducible** code from
you by tonight.
Also, please do not forget to show the result of
sessionInfo()!
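
For concreteness, a reproducible report could look roughly like the sketch
below. It reuses the random construction quoted further down, but with
dimensions scaled down by about a factor of ten so that it runs quickly.
The helper name make_example and the scaled sizes are assumptions for
illustration only and say nothing about the real data:

```r
library(Matrix)

# Hypothetical helper: random n x m sparse matrix built from nnz (i, j, x)
# triplets, following the construction used earlier in this thread.
make_example <- function(n, m, nnz, seed = 101) {
  set.seed(seed)
  sparseMatrix(i = sample(n, nnz, replace = TRUE),
               j = sample(m, nnz, replace = TRUE),
               x = round(100 * rnorm(nnz)),
               dims = c(n, m))
}

# Scaled-down stand-in for the 22638 x 80914 matrix with 300000 entries.
M  <- make_example(n = 2264, m = 8091, nnz = 30000)
cm <- crossprod(M)       # t(M) %*% M without the explicit transpose
a  <- cm[1, 1]           # the element access that failed in the report

print(sessionInfo())     # R and package versions, to include in the report
```

Anyone can paste and run such a script, which makes it possible to check
whether the problem depends on the data or on the installed versions.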

Martin Maechler,

    PP> Thanks
    PP> Pallavi

    PP> On Tue, Oct 27, 2009 at 8:34 PM, Martin Maechler <maechler at stat.math.ethz.ch> wrote:

    >> >>>>> "PP" == Pallavi P <pallavip.05 at gmail.com>
    >> >>>>>     on Tue, 27 Oct 2009 18:13:22 +0530 writes:
    >> 
    PP> Hi Martin,
    PP> Thanks for the help. Just to make sure I understand correctly.
    >> 
    PP> The steps below create an example table similar to the one that I
    PP> read from file.
    >> 
    >> yes, exactly
    >> 
    >> n <- 22638
    >> m <- 80914
    >> nnz <- 300000 # no idea if this is realistic for you
    >> 
    >> set.seed(101)
    >> ex <- cbind(i = sample(n,nnz, replace=TRUE),
    >>             j = sample(m,nnz, replace=TRUE),
    >>             x = round(100 * rnorm(nnz)))
    >> 
    >> 
    PP> and I can understand the way sparseMatrix is initialized right now as
    >> M <- sparseMatrix(i = ex[,"i"],
    >>                   j = ex[,"j"],
    >>                   x = ex[,"x"])
    >> 
    PP> However, I couldn't understand the purpose of the commands below.
    >> 
    >> MM. <- tcrossprod(M) # == MM' := M %*% t(M)
    >> M.1 <- M %*% rep(1, ncol(M))
    >> stopifnot(identical(drop(M.1), rowSums(M)))
    >> 
    >> They were just for illustrative purposes,
    >> to show how and that you can work with the created sparse matrix
    >> 'M'.
    >> 
    >> Regards,
    >> Martin Maechler, ETH Zurich
    >> 
    PP> Kindly let me know if I missed something.
    >> 
    PP> Thanks
    PP> Pallavi
    >> 
    >> 



