[R] function for grouping

Petr Savicky savicky at cs.cas.cz
Thu Jan 26 23:46:11 CET 2012


On Thu, Jan 26, 2012 at 08:29:22AM -0800, yan wrote:
> what if I don't need to store the combination results, I just want to get the
> combination result for a given row.
> e.g. for the 5 elements divided into 3 groups , and if I give another
> parameter which is the row number of the results, in petr's example, say if
> I set row number to 1, then I get 1,2,3,1,1.
> 
> So I need a systematic way of generating the combination, but I don't need
> all the combinations in one go.

Hi.

The following function partitioning() computes m-th
partitioning of n elements into "groups" groups in the
lexicographic order.

  partitioning <- function(n, groups, m)
  {
      a <- matrix(nrow=n+1, ncol=groups+2)
      a[, groups + 2] <- 0
      a[1, 0:groups + 1] <- 0
      a[1, groups + 1] <- 1
      for (i in seq.int(from=2, length=n)) {
          for (j in 0:groups) {
              a[i, j + 1] <- j*a[i - 1, j + 1] + a[i - 1, j + 2]
          }
      }
      available <- a[n + 1, 1]
      stopifnot(m <= available)
      x <- rep(NA, times=n)
      for (k in seq.int(length=n)) {
          free <- max(x[seq.int(length=k-1)], 0)
          for (j in seq.int(length=min(free + 1, groups))) {
              subtree <- a[n - k + 1, max(free, j) + 1]
              if (subtree < m) {
                  available <- available - subtree
                  m <- m - subtree
              } else {
                  x[k] <- j
                  break
              }
          }
      }
      x
  }
  
  # the previous code for comparison modified to
  # produce partitionings in the same order
  
  check.row <- function(x, k)
  {
      y <- unique(x)
      length(y) == k && all(y == 1:k)
  }
  
  allPart <- as.matrix(rev(expand.grid(x5=1:3, x4=1:3, x3=1:3, x2=1:2, x1=1)))
  ok <- apply(allPart, 1, check.row, k=3)
  allPart <- allPart[ok, ]
  
  # the comparison itself
  
  for (m in seq.int(length=nrow(allPart)+1)) {
      x <- partitioning(5, 3, m)
      cat(m, x, x - allPart[m, ], "\n")
  }

  # the output is 

  1 1 1 1 2 3 0 0 0 0 0 
  2 1 1 2 1 3 0 0 0 0 0 
  3 1 1 2 2 3 0 0 0 0 0 
  4 1 1 2 3 1 0 0 0 0 0 
  5 1 1 2 3 2 0 0 0 0 0 
  6 1 1 2 3 3 0 0 0 0 0 
  7 1 2 1 1 3 0 0 0 0 0 
  8 1 2 1 2 3 0 0 0 0 0 
  9 1 2 1 3 1 0 0 0 0 0 
  10 1 2 1 3 2 0 0 0 0 0 
  11 1 2 1 3 3 0 0 0 0 0 
  12 1 2 2 1 3 0 0 0 0 0 
  13 1 2 2 2 3 0 0 0 0 0 
  14 1 2 2 3 1 0 0 0 0 0 
  15 1 2 2 3 2 0 0 0 0 0 
  16 1 2 2 3 3 0 0 0 0 0 
  17 1 2 3 1 1 0 0 0 0 0 
  18 1 2 3 1 2 0 0 0 0 0 
  19 1 2 3 1 3 0 0 0 0 0 
  20 1 2 3 2 1 0 0 0 0 0 
  21 1 2 3 2 2 0 0 0 0 0 
  22 1 2 3 2 3 0 0 0 0 0 
  23 1 2 3 3 1 0 0 0 0 0 
  24 1 2 3 3 2 0 0 0 0 0 
  25 1 2 3 3 3 0 0 0 0 0 
  Error: m <= available is not TRUE

The zeros confirm that partitioning(5, 3, m) is exactly
m-th row of allPart. The error at the end shows that
the function partitioning() recognizes the situation
that m is larger than the number of partitionings with
the given parameters.

The idea of the algorithm is that it searches through
the tree of all encodings of the partitionings and
the precomputed matrix a[,] allows to determine the
number of leaves under any node of the tree. Using
this, the algorithm can choose the right branch at
each node.

Hope this helps.

Petr.



More information about the R-help mailing list