[R] Compact Indicator Matrices

amarkos amarkos at gmail.com
Sun May 11 16:49:47 CEST 2008


On May 11, 4:47 pm, "Douglas Bates" <ba... at stat.wisc.edu> wrote:

> Do you mean that you want to collapse similar rows into a single row
> and perhaps a count of the number of times that this row occurs?

Let me rephrase the problem by providing an example.

Input:

A =
      [,1] [,2]
 [1,]    1    1
 [2,]    1    3
 [3,]    2    1
 [4,]    1    2
 [5,]    2    1
 [6,]    1    2
 [7,]    1    1
 [8,]    1    2
 [9,]    1    3
[10,]    2    1

# Convert each column of the input matrix A to a factor
obj <- data.frame(lapply(data.frame(A), as.factor))

nocases <- dim(obj)[1]
novars  <- dim(obj)[2]

# Number of levels per variable and their cumulative sum
levels.n <- sapply(obj, nlevels)
n        <- cumsum(levels.n)

# Indicator matrix calculations
Z        <- matrix(0, nrow = nocases, ncol = n[length(n)])
newdat   <- lapply(obj, as.numeric)
# Column offset of each variable's first category within Z
offset   <- c(0, n[-length(n)])
# Set the matching cells to 1 through column-major linear indices
for (i in seq_len(novars))
  Z[1:nocases + (nocases * (offset[i] + newdat[[i]] - 1))] <- 1

#######

Output:

Z =

      [,1] [,2] [,3] [,4] [,5]
 [1,]    1    0    1    0    0
 [2,]    1    0    0    0    1
 [3,]    0    1    1    0    0
 [4,]    1    0    0    1    0
 [5,]    0    1    1    0    0
 [6,]    1    0    0    1    0
 [7,]    1    0    1    0    0
 [8,]    1    0    0    1    0
 [9,]    1    0    0    0    1
[10,]    0    1    1    0    0


Z is an indicator matrix in the Multiple Correspondence Analysis
framework. My problem is to collapse identical rows (e.g. rows 2 and 9)
into a single row and to store the ids of the original rows that each
collapsed row stands for.
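
One possible way to do the collapsing is sketched below, using only base R.
It is not taken from the thread: the names key, row.ids, Z.compact and
counts are made up for illustration, and Z is retyped from the output
above so that the snippet runs on its own. The idea is to paste each row
into a string key, keep the first row for each distinct key, and group the
row ids by key.

```r
# Indicator matrix copied from the example output (10 cases, 5 columns)
Z <- matrix(c(1, 0, 1, 0, 0,
              1, 0, 0, 0, 1,
              0, 1, 1, 0, 0,
              1, 0, 0, 1, 0,
              0, 1, 1, 0, 0,
              1, 0, 0, 1, 0,
              1, 0, 1, 0, 0,
              1, 0, 0, 1, 0,
              1, 0, 0, 0, 1,
              0, 1, 1, 0, 0), ncol = 5, byrow = TRUE)

# One string key per row; identical rows produce identical keys
key <- apply(Z, 1, paste, collapse = "")

# Original row ids grouped by row pattern, in order of first appearance
row.ids <- split(seq_len(nrow(Z)), factor(key, levels = unique(key)))

# Compact matrix: the first occurrence of each distinct row
Z.compact <- Z[!duplicated(key), , drop = FALSE]

# Number of original rows behind each compact row
counts <- lengths(row.ids)
```

With this layout, row k of Z.compact corresponds to row.ids[[k]], and
Z.compact[match(key, unique(key)), ] rebuilds the full matrix. The string
key is safe here because every cell is 0 or 1; for matrices with
multi-digit entries a separator in paste() would be needed to keep the
keys unambiguous.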


