[Rd] Subsetting "dspMatrix" without coercion to "matrix"

Mikael Jagan j@g@nmn2 @end|ng |rom gm@||@com
Wed Nov 17 06:13:57 CET 2021


Currently, the class for dense symmetric matrices in packed storage, 
"dspMatrix", inherits its subset (i.e., `[`) methods from the "Matrix" 
class. As a result, subsetting "dspMatrix" requires coercion to 
"matrix". If memory cannot be allocated for this "matrix", then an error 
results.

n <- 30000L
x <- as.double(seq_len(choose(n + 1L, 2L)))
S <- new("dspMatrix", uplo = "L", x = x, Dim = c(n, n))

S[, 1L]
# Error: vector memory exhausted (limit reached?)

traceback()
# 10: .dsy2mat(dsp2dsy(from))
# 9: asMethod(object)
# 8: as(x, "matrix")
# 7: x[, j = j, drop = TRUE]
# 6: x[, j = j, drop = TRUE]
# 5: eval(call, parent.frame())
# 4: eval(call, parent.frame())
# 3: callGeneric(x, , j = j, drop = TRUE)
# 2: S[, 1L]
# 1: S[, 1L]

This seems entirely avoidable, given that there is a relatively simple 
formula for converting 2-ary indices [i,j] of S to 1-ary indices k of 
S[lower.tri(S, TRUE)]:

k <- i + round(0.5 * (2L * n - j) * (j - 1L)) # for i >= j

Has the implementation of `[` for class "dspMatrix" been discussed 
already elsewhere? If not, shall I report it to Bugzilla as a wishlist item?

FWIW, I encountered this problem while trying to coerce a large "dist" 
object to "dspMatrix", expecting that I would be able to safely subset 
the result:

setAs(from = "dist", to = "dspMatrix", function(from) {
   p <- length(from)
   if (p > 0L) {
     n <- as.integer(round(0.5 * (1 + sqrt(1 + 8 * p)))) # p = n*(n-1)/2
     x <- double(p + n)
     i <- 1L + c(0L, cumsum(n:2L))
     x[-i] <- from
   } else {
     n <- 1L
     x <- 0
   }
   new("dspMatrix", uplo = "L", x = x, Dim = c(n, n))
})

But that attempt failed for a entirely different reason. For large 
enough n, I would hit a memory limit at the line:

x[-i] <- from

So I guess my second question is: what is problematic about this 
benign-seeming subset-assignment?

Mikael



More information about the R-devel mailing list