[R] Help on comparing two matrices

Daniel Nordlund djnordlund at verizon.net
Tue Aug 25 07:01:33 CEST 2009


> -----Original Message-----
> From: r-help-bounces at r-project.org 
> [mailto:r-help-bounces at r-project.org] On Behalf Of Michael Kogan
> Sent: Saturday, August 22, 2009 11:45 AM
> To: r-help at r-project.org
> Subject: [R] Help on comparing two matrices
> 
> Hi,
> 
> I need to compare two matrices with each other. If you can get one of 
> them out of the other one by resorting the rows and/or the 
> columns, then 
> both of them are equal, otherwise they're not. A matrix could 
> look like 
> this:
> 
>      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
> [1,]    0    1    1    1    0    1    1    0
> [2,]    1    1    0    0    0    1    0    1
> [3,]    1    0    1    0    0    0    1    1
> [4,]    1    1    0    0    1    0    0    0
> [5,]    1    0    1    1    1    0    0    0
> [6,]    0    1    0    1    1    0    0    0
> [7,]    0    0    0    0    0    1    1    1
> 
> Note that each matrix consists of ones and zeros, in each row and in 
> each column there are at least three ones and one zero and 
> each pair of 
> rows/columns may have at most two  positions where both are 
> ones (e.g. 
> for the 1. and 2. rows those positions are 2 and 6).
> 
> I was advised to sort both matrices in the same way and then 
> to compare 
> them element by element. But I don't manage to get them sorted... My 
> approach is as following:
> 
> 1. Sort the rows after the row sums (greater sums first).
> 2. Sort the columns after the first column (columns with ones in the 
> first row go left, columns with zeros go right).
> 3. Save the left part (all columns with ones in the first 
> row) and the 
> right part in separate matrices.
> 4. Repeat steps 2 and 3 with both of the created matrices (now taking 
> the second row for sorting), repeat until all fragments consist of a 
> single column.
> 5. Compose the columns to a sorted matrix.
> 
> This algorithm has several problems:
> 
> 1. How to make a loop that is branching out in two subloops on each 
> iteration?
> 2. How to organize the intermediate results and compose them without 
> losing the order? Maybe save them in lists and sublists?
> 3. A fundamental problem: If there are rows with equal sums 
> the result 
> may depend on which of them is sorted after first. Maybe this 
> algorithm 
> won't work at all because of this problem?
> 
> Thanks in advance for your input,
> Michael
> 

Michael,

I see that you have received a number of responses to your request, and you
may have already solved your problem.  It seems to me that the responses so
far don't guarantee finding a match if both row and column exchanges are
necessary.  I put together some code as an R learning task for me.  It is
only partially automated.  Someone with more R programming skills than me
could probably totally automate this process.  I think your initial plan of
sorting by row total was moving in the right direction.  I took your
original matrix, m, and created a matrix,  x, as a random ordering of rows
and columns of m.  Then sorted m and x by both row and column totals. 

Next, all possible orderings of the ordered x matrix, x.o, were constructed
by permuting rows (and columns) with the same row ( or column) totals and
were compared to the ordered m matrix, m.o .   I used the permutations
function from the gtools package, so you would need to install and load that
package.  If a match was found, TRUE was printed out.  A better R programmer
than me could probably finish automating the entire process, wrapping it all
up in a nice function, but this at least gives you a framework for a
solution. 

require(gtools)

m <- matrix(c(
0, 1, 1, 1, 0, 1, 1, 0,
1, 1, 0, 0, 0, 1, 0, 1,
1, 0, 1, 0, 0, 0, 1, 1,
1, 1, 0, 0, 1, 0, 0, 0,
1, 0, 1, 1, 1, 0, 0, 0,
0, 1, 0, 1, 1, 0, 0, 0,
0, 0, 0, 0, 0, 1, 1, 1), 7, 8, byrow = TRUE)

##----create comparison matrix by randomly ordering rows and columns (should
compare TRUE)----##
i <- sample(1:nrow(m))
j <- sample(1:ncol(m))
x <- m[i,j]

##----if row and column totals aren't the same, then matrices can't be
"equal"----##
#       no need to proceed if false
identical(sort(apply(m,1,sum)),sort(apply(x,1,sum))) &
identical(sort(apply(m,2,sum)),sort(apply(x,2,sum)))

##----Order both arrays by row and column totals----##
m.o <-  m[order(apply(m,1,sum)),order(apply(m,2,sum))]
x.o <-  x[order(apply(x,1,sum)),order(apply(x,2,sum))]

##----build row permutation list----##
#       these are groups of rows that have same sum
rowtot <- rle(apply(x.o,1,sum))$lengths
rb <- list()
begin <- 1
for(i in 1:length(rowtot)){
  rb[[i]] <- permutations(rowtot[i],rowtot[i],v=begin:cumsum(rowtot)[i]) 
  begin <- begin + rowtot[i]
  }
# construction of rperm could (should) be automated   
rperm <- cbind(rb[[1]][rep(1:nrow(rb[[1]]), each = nrow(rb[[2]])), ], 
               rb[[2]][rep(1:nrow(rb[[2]]), nrow(rb[[1]])), ] )
rperm <- cbind(rperm[rep(1:nrow(rperm), each = nrow(rb[[3]])), ], 
               rb[[3]][rep(1:nrow(rb[[3]]), nrow(rperm)), ] )

##----build column permutation list----##
#       these are groups of columns that have same sum
coltot <- rle(apply(x.o,2,sum))$lengths
cb <- list()
begin <- 1
for(i in 1:length(coltot)){
  cb[[i]] <- permutations(coltot[i],coltot[i],v=begin:cumsum(coltot)[i]) 
  begin <- begin + coltot[i]
  }
# construction of cperm could (should) be automated   
cperm <- cbind(cb[[1]][rep(1:nrow(cb[[1]]), each = nrow(cb[[2]])), ], 
               cb[[2]][rep(1:nrow(cb[[2]]), nrow(cb[[1]])), ] )

##----compare m.o with all x.o where rows and columns of x.o with tied
totals are permuted----##  
for(i in 1:nrow(rperm)){
  for(j in 1:nrow(cperm)){
    if(identical(m.o,x.o[rperm[i,],cperm[j,]])) {
      cat('TRUE','\n')
      break
      }
    }
}

Hope this is helpful,

Dan

Daniel Nordlund
Bothell, WA USA




More information about the R-help mailing list