[R] Comparing matrices in R - matrixB %in% matrixA

John Fox jfox at mcmaster.ca
Fri Oct 31 14:35:06 CET 2014


Dear Charles,

How about the following?

----------- snip ---------

> AA <- as.list(as.data.frame(t(A)))
> BB <- as.list(as.data.frame(t(B)))
> which(AA %in% BB)
[1] 4 5

----------- snip ---------
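For anyone who wants to paste this in directly, here is a minimal self-contained sketch of the same idea using the toy matrices from the original question. The wrapper name rows_in() is hypothetical (it is not a base R function), and the drop = FALSE is my addition so that a one-row B stays a matrix:

```r
# Toy matrices from the original question.
A <- matrix(1:10, nrow = 5)
B <- A[-c(1, 2, 3), , drop = FALSE]  # drop = FALSE keeps B a matrix even with one row

# rows_in() is a hypothetical helper name, not part of base R.
# It returns the indices of the rows of A that also occur as rows of B.
rows_in <- function(A, B) {
  AA <- as.list(as.data.frame(t(A)))  # one list element per row of A
  BB <- as.list(as.data.frame(t(B)))  # one list element per row of B
  which(AA %in% BB)
}

rows_in(A, B)
# [1] 4 5
```

Note that this compares whole rows, so a row of A counts as a match only if all of its values, in order, equal some row of B. The sketch assumes A and B have the same number of columns and the same storage type.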

This seems reasonably fast. For example:

----------- snip ---------

> A <- matrix(1:10000, 10000, 10)
> B <- A[1:1000, ]
> 
> system.time({
+   AA <- as.list(as.data.frame(t(A)))
+   BB <- as.list(as.data.frame(t(B)))
+   print(sum(AA %in% BB))
+ })
[1] 1000
   user  system elapsed 
   0.26    0.00    0.26 

----------- snip ---------
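As a sanity check on the larger example, the result can be compared against a different way of doing the same thing. The sketch below is an alternative I am assuming for illustration, not something from the question: it pastes each row into a single string key and uses %in% on the keys. The helper name key() and the "\r" separator are hypothetical choices.

```r
A <- matrix(1:10000, 10000, 10)
B <- A[1:1000, ]

# Approach from above: one list element per row, then %in% on the lists.
fast <- which(as.list(as.data.frame(t(A))) %in% as.list(as.data.frame(t(B))))

# Alternative sketch: collapse each row into one string key.
# key() is a hypothetical helper; "\r" is just a separator unlikely to occur in the data.
key <- function(M) do.call(paste, c(as.data.frame(M), sep = "\r"))
alt <- which(key(A) %in% key(B))

# Both approaches should pick out the same rows of A.
stopifnot(identical(fast, alt), length(fast) == 1000L)
```

If the two disagree on real data, the usual suspects would be mixed storage types (integer versus double) or floating-point values that print the same but are not exactly equal.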

I hope this helps,
 John

------------------------------------------------
John Fox, Professor
McMaster University
Hamilton, Ontario, Canada
http://socserv.mcmaster.ca/jfox/

On Fri, 31 Oct 2014 14:20:38 +0100
 Charles Novaes de Santana <charles.santana at gmail.com> wrote:
> My apologies: I sent the message before finishing it, and I am very
> sorry about this. Please find my message below (I usually write my messages
> from the end to the beginning... sorry :)).
> 
> Dear all,
> 
> I am trying to compare two matrices, in order to find in which rows of a
> matrix A I can find the same values as in matrix B. I am trying to do it
> for matrices with around 2500 elements, but please find below a toy example:
> 
> A = matrix(1:10,nrow=5)
> B = A[-c(1,2,3),];
> 
> So
> > A
>      [,1] [,2]
> [1,]    1    6
> [2,]    2    7
> [3,]    3    8
> [4,]    4    9
> [5,]    5   10
> 
> and
> > B
>      [,1] [,2]
> [1,]    4    9
> [2,]    5   10
> 
> I would like to compare A and B in order to find in which rows of A I can
> find the rows of B. Something similar to %in% for one-dimensional arrays.
> In the example above, the answer should be 4 and 5.
> 
> I wrote a function to do it (see below). It gives me the correct answer
> for this toy example, but the nested for-loops make it extremely slow
> for larger matrices. I was wondering if there is a better way to do this
> kind of comparison. Any ideas? Sorry if it is a stupid question.
> 
> matbinmata<-function(B,A){
>     res<-c();
>     rowsB = length(B[,1]);
>     rowsA = length(A[,1]);
>     colsB = length(B[1,]);
>     colsA = length(A[1,]);
>     for (i in 1:rowsB){
>         for (j in 1:colsB){
>             for (k in 1:rowsA){
>                 for (l in 1:colsA){
>                     if(A[k,l]==B[i,j]){res<-c(res,k);}
>                 }
>             }
>         }
>     }
>     return(unique(sort(res)));
> }
> 
> 
> Best,
> 
> Charles
> 
> -- 
> Um axé! :)
> 
> --
> Charles Novaes de Santana, PhD
> http://www.imedea.uib-csic.es/~charles
> 


