[Rd] (no subject)

Wed Feb 1 18:25:10 CET 2006

Suppose X is a data.frame with n observations and k variables, all of
which are factors.

tab <- table(X)

contains a k-dimensional array.

I would like to get a list from tab such that each element contains
the indices of the observations that fall in the same cell of this
k-dimensional array. Of course, only non-empty cells are of interest.

E.g.

 > set.seed(123)
 > X <- as.data.frame(matrix(rnorm(500),100,5))
 > X$V1 <- cut(X$V1, br=5)
 > X$V2 <- cut(X$V2, br=5)
 > X$V3 <- cut(X$V3, br=5)
 > X$V4 <- cut(X$V4, br=5)
 > X$V5 <- cut(X$V5, br=5)
 > tab <- table(X)
 > which(tab>0) -> cells
 > length(cells)
[1] 94

Thus 94 of the 5^5 = 3125 cells are non-empty.
I would like a smart way (without reimplementing table/tabulate) to
get the list of length 94 that contains the indices of the
observations in each cell.
Or, vice versa, a vector of length n that tells, observation by
observation, which cell (out of the 3125) the observation is in.
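
To make the request concrete, here is a rough sketch of the kind of result meant. It assumes X is built as in the example, and the names dims, codes, stride, cell and idx are only illustrative. It recomputes the cell position from the factor codes by hand, relying on the fact that an R array is stored with the first dimension varying fastest. This is not necessarily the smart way being asked for.

```r
# Illustrative sketch only: rebuild the example data, then compute cell indices.
set.seed(123)
X <- as.data.frame(matrix(rnorm(500), 100, 5))
X[] <- lapply(X, cut, breaks = 5)        # turn every column into a factor
tab <- table(X)

dims  <- sapply(X, nlevels)              # extent of each dimension of tab
codes <- sapply(X, as.integer)           # n x k matrix of integer level codes

# Column-major linear index: the first factor varies fastest, as in tab.
stride <- cumprod(c(1, dims[-length(dims)]))
cell   <- as.vector(1 + (codes - 1) %*% stride)   # length n, values in 1..prod(dims)

# List with one element per non-empty cell, holding the observation indices.
idx <- split(seq_len(nrow(X)), cell)
```

Here cell[i] is the position in tab of observation i, so tab[cell] is always positive, and names(idx) gives the same cell numbers as which(tab > 0).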
stefano