[R] writing spdiags function for R

Mon Aug 26 11:37:30 CEST 2013

Dear R-list,

l have been working on a translation of a matlab library into R, it took me
a while but I am almost there to submit it to CRAN...however, for this
library to be computationally competitive I need to solve an issue with the
home-made R version of the spdiags.m function (an old issue coming back).
What I have now working is: 

# A = rbind(c(1,3,1,0), c(1,1,0,2), c(2,1,0,0) )
# A = replicate(100, rnorm(50)) 
A = replicate(1000, rnorm(1000)) 

exdiag <- function(mat, off) {mat[row(mat)+off == col(mat)]}
spdiags <- function(A){
    require(Matrix)

    indx = which(A != 0, arr.ind = T)
    i = indx[,1]; j = indx[,2]

    d = sort(j-i);
    d = d[which(diff(c(-Inf, d) ) != 0)]

    m = nrow(A); n = ncol(A);
    p = length(d)
    empty = vector()

    A = as.matrix(A) ## make A logical, otherwise exdiag throws horrible
warnings
   ## might have to condition the above command on the type of input matrix

    B =  Matrix(0, nrow = min(c(m,n)), ncol = p, sparse = TRUE);   

    system.time(
        for (k in 1:p){
            print(k)

            if (m >= n){
                i = max(c(1, 1 + d[k])):min(c(n, m + d[k]) )
            } else {
                i = max(c(1, 1 - d[k])):min(c(m, n-d[k]) )
            }

            if (length(i) != 0){

                B[i, k] = exdiag(A, d[k])

         #  print(B)
         #  print(i)
         #  print(k)

            }

        }
        )

   return (list( B = B, d = d) )

}

the advantages are: 1) that it works with matrices of any dimension
(previous versions worked only with squared matrices).  2) it does not need
other functions, beside exdiag to work.

However, I run into serious issues with computational efficiency. As I am
extracting non-zero diagonals and filling B in a for loop, the time it takes
to complete each iteration highly depends on the dimensions of the matrix. 
Try the different starting A matrices to test this. 

Unfortunately, I really need to optimize this aspect, as the people who I am
expecting to use this library would have really long time-series, i.e. it
would make .m win over .R

So, as I am wondering if there is a more efficient solution to do this
(perhaps using apply?) that I cannot currently see. Perhaps using sparse
matrices could help, but exdiag throws a warning:  "<sparse>[ <logic> ] :
.M.sub.i.logical() maybe inefficient" (noted also in previous responses to
this post of mine).

Using the function ?band has the drawback of padding the matrix with 0
(except the diagonal considered, e.g., band(A,1,1) ). Often, however, you
want to keep the 0s along the diagonal of interest, distinguished from the
0s of the other diagonals. Thus, doing something like:  which(band(A,1,1) !=
0, arr.ind = TRUE)  to get the non-zero element would also discard the 0s
along the diagonal.

The nice getband,R posted by Ben above, makes a combined use of diag and
band, together with a system of indeces, and tries to work around the issue
of the zero padding.  However, I did not manage to use/adapt this function
to work with matrices that are not squared. And, as I am not really 100% a
programmer, I did not find a solution to it... 
Moreover, I believe the problem of computational efficiency would still be
there, even fixing the dimensionality issue.

I really hope somebody out there can help me to figure this out. 

thanks a lot for your precious time and attention,

Moreno

--
View this message in context: http://r.789695.n4.nabble.com/writing-spdiags-function-for-R-tp4552626p4674540.html
Sent from the R help mailing list archive at Nabble.com.