[R] fast way to compare two matrices of combinations

Erik Iverson iverson at biostat.wisc.edu
Thu Mar 13 17:27:58 CET 2008


Hello Mark -

It may help if you provide a (small) set of example input and what you'd 
like as your output.
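
For instance, something along these lines would be enough to work from. This is only a sketch: the gene ids and the names toy.pairs and wanted are made up for illustration and are not taken from your data.

```r
# Hypothetical toy input: a list of character vectors of unique gene ids
toy.pairs <- list(c("g1", "g2", "g3", "g4"),
                  c("g2", "g3", "g4"),
                  c("g1", "g2"))

# Desired output written out by hand: one row per triplet, with the
# number of list elements that contain all three ids
wanted <- data.frame(triplet = c("g1 g2 g3", "g1 g2 g4",
                                 "g1 g3 g4", "g2 g3 g4"),
                     count   = c(1, 1, 1, 2))

# dput() prints an object as R code that can be pasted into an email
dput(toy.pairs)
```

The third element is shorter than a triplet, so it contributes nothing to the counts.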

Best,
Erik Iverson

Mark W Kimpel wrote:
> I have a list (length 750), each element containing a vector of unique 
> strings (unique gene ids), with length up to ~40 (median 15). I want to 
> compile a matrix of all possible triplets and their frequency within 
> gene elements. Using combn and a lot of looping, I am accomplishing this 
> but it is VERY slow.
> 
> I've tried to figure out a way to vectorize this, using "match" and 
> "%in%", but can't get my mind around it.
> 
> Below is my code. sig.tf.pairs is the list. Suggestions?
> 
> Mark
> 
> 
> ############################################################
> M <- 3 # 3 for triplets, etc.
> ##########################################################
> # count all triplets
> all.triplets <- NULL
> all.count.vec <- NULL
> for (i in 1:length(sig.tf.pairs)){
>    if (length(sig.tf.pairs[[i]]) >= M){
>      triplets <- combn(sig.tf.pairs[[i]], M, simplify = TRUE)
>      for (j in 1:ncol(triplets)){
>        o <- order(triplets[,j])
>        triplets[,j] <- triplets[o,j]
>        count.vec <- rep(1, ncol(triplets))
>      }
>      if (is.null(all.count.vec)){
>        all.count.vec <- count.vec
>        all.triplets <- triplets
>      } else {
>        redundant.vec <- NULL
>        for (k in 1:ncol(all.triplets)){
>          for (m in 1:ncol(triplets)){
>            if (length(intersect(triplets[,m], all.triplets[,k])) == M){
>              all.count.vec[k] <- all.count.vec[k] + 1
>              redundant.vec <- c(redundant.vec, m)
>            }
>          }
>        }
>        if(!is.null(redundant.vec)){
>          triplets <- triplets[, -redundant.vec, drop = FALSE]
>          count.vec <- count.vec[-redundant.vec]
>        }
>        all.triplets <- cbind(all.triplets, triplets)
>        all.count.vec <- c(all.count.vec, count.vec)
>      }
>    }
> }
> ###################################
>
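
One possible way to avoid the nested loops, sketched here on made-up data rather than on sig.tf.pairs itself: sort each element, let combn paste every combination into a single key string, and count the keys with one call to table. The name toy.pairs and the "|" separator are assumptions for illustration; the separator must not occur inside any gene id.

```r
# Hypothetical stand-in for sig.tf.pairs
toy.pairs <- list(c("g1", "g2", "g3", "g4"),
                  c("g4", "g3", "g2"),
                  c("g1", "g2"))
M <- 3  # 3 for triplets, etc.

# Keep only elements long enough to yield a combination of size M
long.enough <- Filter(function(x) length(x) >= M, toy.pairs)

# Sorting first puts every combination in a canonical order, so the same
# triplet always produces the same key regardless of input order
keys <- unlist(lapply(long.enough, function(x)
  combn(sort(x), M, FUN = paste, collapse = "|")))

# One pass of table() counts how many elements contain each triplet
triplet.counts <- table(keys)

# Split the keys back into a matrix with one triplet per row, if needed
triplet.matrix <- do.call(rbind,
                          strsplit(names(triplet.counts), "|", fixed = TRUE))
```

Because the ids are unique within each element, a triplet can appear at most once per element, so each count is the number of elements containing that triplet.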