[R] fast way to compare two matrices of combinations

Patrick Burns pburns at pburns.seanet.com
Thu Mar 13 17:37:16 CET 2008


One thing that will probably speed things up enormously
is to avoid growing objects (all.triplets, all.count.vec,
redundant.vec) inside the loops.  Instead, create them at
roughly the right size up front and, if they fill up, do
something like doubling their size.
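
A minimal sketch of that preallocate-and-double idea, not a drop-in replacement for the quoted code. The example data, the starting capacity of 2, and the names capacity, n.used and n.new are assumptions made for illustration; only sig.tf.pairs, M and all.triplets come from the original post. It collects every sorted triplet without counting duplicates.

```r
# Hypothetical stand-in for the real sig.tf.pairs list
sig.tf.pairs <- list(c("a", "b", "c", "d"), c("d", "c", "b"), c("a", "b"))
M <- 3

capacity <- 2                          # deliberately small so doubling is exercised
all.triplets <- matrix(NA_character_, nrow = M, ncol = capacity)
n.used <- 0                            # number of columns filled so far

for (ids in sig.tf.pairs) {
  if (length(ids) < M) next            # too few ids to form a triplet
  # combn keeps the element order of its input, so sorting the ids
  # first yields columns that are already sorted
  triplets <- combn(sort(ids), M)
  n.new <- ncol(triplets)
  # double the storage until the new columns fit
  while (n.used + n.new > capacity) {
    all.triplets <- cbind(all.triplets,
                          matrix(NA_character_, nrow = M, ncol = capacity))
    capacity <- 2 * capacity
  }
  all.triplets[, n.used + seq_len(n.new)] <- triplets
  n.used <- n.used + n.new
}

# trim the unused columns once, at the end
all.triplets <- all.triplets[, seq_len(n.used), drop = FALSE]
```

The matrix is copied only when it is full, so the number of copies grows with the logarithm of the final size rather than with the number of list elements.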

Patrick Burns
patrick at burns-stat.com
+44 (0)20 8525 0696
http://www.burns-stat.com
(home of S Poetry and "A Guide for the Unwilling S User")

Mark W Kimpel wrote:

>I have a list (length 750), each element containing a vector of unique 
>strings (unique gene ids), with length up to ~40 (median 15). I want to 
>compile a matrix of all possible triplets and their frequency within 
>gene elements. Using combn and a lot of looping, I am accomplishing this 
>but it is VERY slow.
>
>I've tried to figure out a way to vectorize this, using "match" and 
>"%in%", but can't get my mind around it.
>
>Below is my code. sig.tf.pairs is the list. Suggestions?
>
>Mark
>
>
>############################################################
>M <- 3 # 3 for triplets, etc.
>##########################################################
># count all triplets
>all.triplets <- NULL
>all.count.vec <- NULL
>for (i in 1:length(sig.tf.pairs)){
>   if (length(sig.tf.pairs[[i]]) >= M){
>     triplets <- combn(sig.tf.pairs[[i]], M, simplify = TRUE)
>     for (j in 1:ncol(triplets)){
>       triplets[,j] <- sort(triplets[,j])
>     }
>     count.vec <- rep(1, ncol(triplets))
>     if (is.null(all.count.vec)){
>       all.count.vec <- count.vec
>       all.triplets <- triplets
>     } else {
>       redundant.vec <- NULL
>       for (k in 1:ncol(all.triplets)){
>         for (m in 1:ncol(triplets)){
>           if (length(intersect(triplets[,m], all.triplets[,k])) == M){
>             all.count.vec[k] <- all.count.vec[k] + 1
>             redundant.vec <- c(redundant.vec, m)
>           }
>         }
>       }
>       if(!is.null(redundant.vec)){
>         triplets <- triplets[,-redundant.vec, drop = FALSE]
>         count.vec <- count.vec[-redundant.vec]
>       }
>       all.triplets <- cbind(all.triplets, triplets)
>       all.count.vec <- c(all.count.vec, count.vec)
>     }
>   }
>}
>###################################
>
>  
>
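
For the vectorization asked about in the quoted message, one possible approach (not the one used in the quoted code, and not part of the reply above) is to turn each sorted triplet into a single string key and let table() do the counting. This is only a sketch under assumptions: the example data is made up, the "|" separator is assumed never to occur inside a gene id, and keys and triplet.counts are hypothetical names.

```r
# Hypothetical stand-in for the real sig.tf.pairs list
sig.tf.pairs <- list(c("a", "b", "c", "d"), c("d", "c", "b"), c("a", "b"))
M <- 3

# keep only the elements long enough to contain a triplet
keep <- lengths(sig.tf.pairs) >= M

# one key per triplet, e.g. "a|b|c"; sorting the ids first makes the
# key independent of the order in which the ids were listed
keys <- unlist(lapply(sig.tf.pairs[keep], function(ids) {
  combn(sort(ids), M, FUN = paste, collapse = "|")
}))

# frequency of each distinct triplet across the list elements
triplet.counts <- table(keys)
```

If the matrix form is needed afterwards, strsplit(names(triplet.counts), "|", fixed = TRUE) recovers the individual ids of each triplet.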


