# [R] Intersection of 2 matrices

Hans W Borchers hwborchers at googlemail.com
Fri Dec 2 20:22:31 CET 2011

```
Michael Kao <mkao006rmail <at> gmail.com> writes:

>
Your solution is fast, but not completely correct, because you are also
counting possible duplicates within the second matrix. The 'refitted'
function could look as follows:

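compMat2 <- function(A, B) {  # number of distinct rows of B present in A
    # drop = FALSE keeps B0 a matrix even if only one unique row remains
    B0 <- B[!duplicated(B), , drop = FALSE]
    na <- nrow(A); nb <- nrow(B0)
    AB <- rbind(A, B0)
    # a row of B0 is flagged as duplicated only if it already occurs in A
    ab <- duplicated(AB)[(na+1):(na+nb)]
    return(sum(ab))
}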

Testing it on an example of the size the OP was asking about:

set.seed(8237)
A  <- matrix(sample(1:1000, 2*67420, replace=TRUE), 67420, 2)
B  <- matrix(sample(1:1000, 2*59199, replace=TRUE), 59199, 2)

system.time(n <- compMat2(A, B))  # n = 3790

whereas compMat() returns 5522 rows, 1732 of which are duplicates within B.
A 3.06 GHz iMac needs about 2 to 2.5 seconds.

Hans Werner

> On 2/12/2011 2:48 p.m., David Winsemius wrote:
> >
> > On Dec 2, 2011, at 4:20 AM, oluwole oyebamiji wrote:
> >
> >> Hi all,
> >>     I have matrix A of 67420 by 2 and another matrix B of 59199 by 2.
> >> I would like to find the number of rows of matrix B that I can find
> >> in matrix A (rows that are common to both matrices with or without
> >> sorting).
> >>
> >> I have tried the "intersection" and "is.element" functions in R, but
> >> they only work for vectors and not for matrices,
> >> i.e., intersection(A,B) and is.element(A,B).
> >
> > Have you considered the 'duplicated' function?
> >
>
> Here is an example based on the duplicated function
>
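> test.mat1 <- matrix(1:20, ncol = 5)   # 4 rows, 5 columns
>
> # sample row indices from the rows that actually exist in test.mat1
> test.mat2 <- rbind(test.mat1[sample(nrow(test.mat1), 2), ],
>                    matrix(101:120, ncol = 5))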
>
> compMat <- function(mat1, mat2){
>      nr1 <- nrow(mat1)
>      nr2 <- nrow(mat2)
>      mat2[duplicated(rbind(mat1, mat2))[(nr1 + 1):(nr1 + nr2)], ]
> }
>
> compMat(test.mat1, test.mat2)
>
>

```
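
As a cross-check on the point about duplicates within B, here is a minimal sketch that counts common rows through per-row string keys instead of rbind() plus duplicated(). The helper names rowKeys and countCommonRows are hypothetical and do not appear in the thread. The sketch also assumes that the separator "\r" never occurs in the data and that the matrix entries convert to character without ambiguity (integers or short character strings, for example).

```r
# Hypothetical helper (not from the thread): one string key per matrix row.
rowKeys <- function(M) {
  do.call(paste, c(as.data.frame(M), sep = "\r"))
}

# Hypothetical helper: number of distinct rows of B that also occur in A.
countCommonRows <- function(A, B) {
  sum(unique(rowKeys(B)) %in% rowKeys(A))
}

# Small example: the row (3, 4) appears twice in B.
A <- rbind(c(1, 2), c(3, 4), c(3, 4), c(5, 6))
B <- rbind(c(3, 4), c(3, 4), c(7, 8), c(1, 2))

sum(rowKeys(B) %in% rowKeys(A))  # 3: the repeated row (3, 4) is counted twice
countCommonRows(A, B)            # 2: each distinct row of B is counted once
```

The first count corresponds to what happens when duplicates inside B are not removed beforehand; the second removes them first, which is the behaviour compMat2() aims for.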