[R] How to efficiently compare each row in a matrix with each row in another matrix?

arun smartpink111 at yahoo.com
Sat Dec 8 19:29:23 CET 2012


Hi,

Maybe this:
N <- 1000
M <- 5
P <- 5000
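set.seed(15)
# note: runif(N, ...) draws only N values; matrix() recycles them to fill all N*M cells
A <- matrix(runif(N,1,1000),nrow=N,ncol=M)
set.seed(425)
# note: likewise, only M values are drawn here and recycled to fill P*M cells
B <- matrix(runif(M,1,1000),nrow=P,ncol=M)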

Marius.3.0 <- function(A, B) {
  # one result column per row of B; entry k is TRUE when all of A[k, ] <= that row
  do.call(cbind, lapply(split(B, row(B)),
                        function(x) colSums(x >= t(A)) == ncol(A)))
}
system.time(Marius.3.0(A,B))
#   user  system elapsed
#  0.524   0.000   0.523

# Marius.2.0(), perhaps() and Marius() are not defined in this message
system.time(Marius.2.0(A,B))
#   user  system elapsed
#  0.972   0.236   1.212

system.time(perhaps(A,B))
#   user  system elapsed
#  1.232   0.244   1.482

system.time(Marius(A,B))
#   user  system elapsed
# 19.266   0.000  19.298
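
For anyone wondering why the one-liner works, here is a minimal sketch of what happens for a single row of B. It uses the toy A from the original question; the names b, cmp and hits are made up for this illustration only. Comparing a length-M vector with t(A) recycles the vector down each column of t(A), so column k of the comparison holds the element-wise test for row k of A.

```r
A <- rbind(matrix(1:4, ncol=2, byrow=TRUE), c(6, 2)) # (3, 2) matrix
b <- c(3, 8)                     # one row of B (the third row of the toy B)

cmp <- b >= t(A)                 # (2, 3) logical; column k compares b with A[k, ]
hits <- colSums(cmp) == ncol(A)  # TRUE where every entry of A[k, ] is <= b
hits
```

Marius.3.0 repeats this step for every row of B and binds the resulting vectors together as columns.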

With the toy example:
A <- rbind(matrix(1:4, ncol=2, byrow=TRUE), c(6, 2)) # (3, 2) matrix
B <- matrix(1:10, ncol=2) # (5, 2) matrix
ind <- apply(B, 1, function(b) apply(A, 1, function(a) all(a <= b)))
ind
#      [,1]  [,2]  [,3]  [,4]  [,5]
#[1,]  TRUE  TRUE  TRUE  TRUE  TRUE
#[2,] FALSE FALSE  TRUE  TRUE  TRUE
#[3,] FALSE FALSE FALSE FALSE FALSE
Marius.3.0(A,B)
#         1     2     3     4     5
#[1,]  TRUE  TRUE  TRUE  TRUE  TRUE
#[2,] FALSE FALSE  TRUE  TRUE  TRUE
#[3,] FALSE FALSE FALSE FALSE FALSE

str(ind)
# logi [1:3, 1:5] TRUE FALSE FALSE TRUE FALSE FALSE ...
str(Marius.3.0(A,B))
# logi [1:3, 1:5] TRUE FALSE FALSE TRUE FALSE FALSE ...
# - attr(*, "dimnames")=List of 2
#  ..$ : NULL
#  ..$ : chr [1:5] "1" "2" "3" "4" ...
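
The str() output suggests that the two results differ only in the column names that split() introduces. A small sketch of how one might confirm this on the toy example, dropping the dimnames with unname() before comparing (res is just a name chosen here):

```r
A <- rbind(matrix(1:4, ncol=2, byrow=TRUE), c(6, 2)) # (3, 2) matrix
B <- matrix(1:10, ncol=2) # (5, 2) matrix

Marius.3.0 <- function(A, B) {
  do.call(cbind, lapply(split(B, row(B)),
                        function(x) colSums(x >= t(A)) == ncol(A)))
}

ind <- apply(B, 1, function(b) apply(A, 1, function(a) all(a <= b)))
res <- Marius.3.0(A, B)

# unname() strips the "1".."5" column names so only the values are compared
identical(unname(res), ind)
```

If the column names get in the way downstream, unname() (or dimnames(res) <- NULL) on the result removes them.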
A.K.






----- Original Message -----
From: Marius Hofert <marius.hofert at math.ethz.ch>
To: R-help <r-help at r-project.org>
Cc: 
Sent: Saturday, December 8, 2012 6:28 AM
Subject: [R] How to efficiently compare each row in a matrix with each row in another matrix?

Dear expeRts,

I have two matrices A and B. They have the same number of columns but possibly different numbers of rows. I would like to compare each row of A with each row of B and check whether all entries in a row of A are less than or equal to the corresponding entries in a row of B. Here is a minimal working example:

A <- rbind(matrix(1:4, ncol=2, byrow=TRUE), c(6, 2)) # (3, 2) matrix
B <- matrix(1:10, ncol=2) # (5, 2) matrix
( ind <- apply(B, 1, function(b) apply(A, 1, function(a) all(a <= b))) ) # (3, 5) = (nrow(A), nrow(B)) matrix

The question is: How can this be implemented more efficiently in R, that is, in a faster way?

Thanks & cheers,

Marius

