[R] operations on sparse matrices, and dense intermediary steps

Jose Quesada quesada at gmail.com
Sat Oct 24 23:49:08 CEST 2009


Hi,

I'm doing some basic operations on large sparse matrices, for example
getting a row.
It takes close to 30 seconds on a 3 GHz machine and sends memory
usage through the roof.
I suspect there are dense intermediary steps (which, if true, would
defeat the purpose of using sparse representations).

As much as I try to understand the hierarchy of Matrix classes, it's a
mystery to me.
Is subsetting sparse matrices memory-intensive? Does it have to do with
features of the language, such as pass-by-value by default?
Or am I doing something inefficient without knowing it?
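
To make the question about the class hierarchy concrete, here is a minimal sketch of how one might inspect which classes are involved before and after taking a row. The tiny 4 x 4 matrix and the names `m` and `row1` are hypothetical and only for illustration; `sparseMatrix()`, `class()` and `is()` are standard Matrix/methods calls. It does not show what happens internally during subsetting, only whether the result is still a sparse object.

```r
library(Matrix)

## Tiny hypothetical matrix, just to look at the classes involved
m <- sparseMatrix(i = c(1, 3, 4), j = c(2, 1, 4), x = c(1.5, -2, 3),
                  dims = c(4, 4))

print(class(m))   # concrete storage class of the object
print(is(m))      # every class this object extends, i.e. its place in the hierarchy

## Extract the first row, keeping the result as a matrix
row1 <- m[1, , drop = FALSE]
print(class(row1))
print(is(row1, "sparseMatrix"))   # TRUE would mean the result stayed sparse
```

If `is(row1, "sparseMatrix")` is TRUE, the returned object is sparse, although that alone says nothing about temporary dense copies made along the way.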

(Note: the example below will only work with 64-bit R and lots of memory;
reduce the size of the matrix by 2-3 orders of magnitude for 32-bit R)
## libraries
library(Matrix)


rSpMatrix <- function(nrow, ncol, nnz,
                      rand.x = function(n) round(rnorm(n), 2))
{
    ## Purpose: random sparse matrix
    ## ----------------------------------------------------------------------
    ## Arguments: (nrow,ncol): dimension
    ##          nnz  :  number of non-zero entries
    ##         rand.x:  random number generator for 'x' slot
    ## ----------------------------------------------------------------------
    ## Author: Martin Maechler, Date: 14.-16. May 2007
    stopifnot((nnz <- as.integer(nnz)) >= 0,
              nrow >= 0, ncol >= 0,
              nnz <= nrow * ncol)
    spMatrix(nrow, ncol,
             i = sample(nrow, nnz, replace = TRUE),
             j = sample(ncol, nnz, replace = TRUE),
             x = rand.x(nnz))
}


## (the original is a term x doc matrix of the entire Wikipedia)
mm <- rSpMatrix(793251, 1027355, nnz = 205746204)
## make it column based:
mm <- as(mm, "CsparseMatrix")
a <- mm[1, , drop = FALSE]  # this takes close to 30 seconds on a 3 GHz machine
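
For anyone without a large 64-bit machine, a scaled-down version of the same experiment might look like the sketch below. The dimensions and `nnz` are hypothetical values, roughly 1/100 of each original dimension with about the same density, and are not figures I have measured. It builds the matrix with `sparseMatrix()` directly so the snippet is self-contained, then uses `system.time()` and `object.size()` to look at the cost and the result of the row extraction.

```r
library(Matrix)

set.seed(42)       # reproducible random pattern
nr  <- 7933L       # hypothetical: about 1/100 of the original rows
nc  <- 10274L      # hypothetical: about 1/100 of the original columns
nnz <- 20575L      # hypothetical: keeps the density roughly the same

## Column-compressed sparse matrix with random positions and values
## (duplicated (i, j) pairs are summed by sparseMatrix())
small <- sparseMatrix(i = sample(nr, nnz, replace = TRUE),
                      j = sample(nc, nnz, replace = TRUE),
                      x = round(rnorm(nnz), 2),
                      dims = c(nr, nc))

print(class(small))                                  # storage class actually used
print(system.time(r1 <- small[1, , drop = FALSE]))   # time to extract one row
print(class(r1))                                     # class of the extracted row
print(object.size(small), units = "Kb")              # memory held by the full matrix
print(object.size(r1), units = "Kb")                 # memory held by the single row
```

Timings at this size will not necessarily predict the behaviour of the full matrix, but they give a baseline that can be scaled up step by step.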

Thanks,
-Jose
--
=== I'm learning a new keyboard layout. I type slowly. Please excuse
typos and slower-than-usual responses. Thanks for your patience===
Jose Quesada, PhD.

Max Planck Institute,
Center for Adaptive Behavior and Cognition -ABC-,
Lentzeallee 94, office 224, 14195 Berlin

http://www.josequesada.name/
http://twitter.com/Quesada
