[R] how to replace my double for loop which is little efficient!

Berend Hasselman bhh at xs4all.nl
Sun Dec 26 15:13:27 CET 2010



bbslover wrote:
> 
> x: is a matrix  202*263,  that is 202 samples, and 263 independent
> variables
> 
> num.compd<-nrow(x); # number of compounds
> diss.all<-0
> for( i in 1:num.compd)
>    for (j in 1:num.compd)
>       if (i!=j) {
>         S1<-sum(x[i,]*x[j,])
>         S2<-sum(x[i,]^2)
>         S3<-sum(x[j,]^2)
>         sim2<-S1/(S2+S3-S1)
>         diss2<-1-sim2
>         diss.all<-diss.all+diss2}
> 
> it will cost a long time to finish this computation! i really need "rapid"
> code to replace my code.
> 

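Alternative 1: the dissimilarity is symmetric in i and j, so the j-loop only
needs to start at i+1 and the total is doubled at the end:

diss2.all <- 0   # accumulator must be initialised before the loop
for( i in 1:num.compd) {
    # length.out=0 when i==num.compd, so the inner loop is skipped then
    for (j in seq(from=i+1,to=num.compd,length.out=max(0,num.compd-i))) {
            S1<-sum(x[i,]*x[j,])
            S2<-sum(x[i,]^2)
            S3<-sum(x[j,]^2)
            sim2<-S1/(S2+S3-S1)
            diss2<-1-sim2
            diss2.all<-diss2.all+diss2
    }
}
diss2.all <- 2 * diss2.all   # each pair (i,j) was counted once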

On my PC this is about twice as fast as your version (with 202 samples and
263 variables).

Alternative 2: none of the sum() calls are necessary, since each of them is
an entry of x %*% t(x). Use some matrix algebra:

xtx <- x %*% t(x)
diss3.all <- 0
for( i in 1:num.compd) {
    for (j in seq(from=i+1,to=num.compd,length.out=max(0,num.compd-i))) {
            S1 <- xtx[i,j]
            S2 <- xtx[i,i]
            S3 <- xtx[j,j]
            sim2<-S1/(S2+S3-S1)
            diss2<-1-sim2
            diss3.all<-diss3.all+diss2
    }
}
diss3.all <- 2 * diss3.all

This is about four times as fast as alternative 1.

I'm quite sure that more expert R gurus can get some more speedup.
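
As one possible direction for further speedup, the remaining double loop in
alternative 2 can be replaced by whole-matrix operations. The following is
only a sketch, not something timed in this thread; the function name
total_tanimoto_diss is made up for illustration. It assumes x is a numeric
matrix with no all-zero rows (an all-zero row gives 0/0, as in the loops).

```r
# Hypothetical helper: sum of 1 - S1/(S2 + S3 - S1) over all pairs i != j.
total_tanimoto_diss <- function(x) {
  xtx  <- tcrossprod(x)                      # same values as x %*% t(x)
  sq   <- diag(xtx)                          # sq[i] == sum(x[i, ]^2)
  sim  <- xtx / (outer(sq, sq, "+") - xtx)   # S1 / (S2 + S3 - S1) for every (i, j)
  diss <- 1 - sim
  diag(diss) <- 0                            # drop the i == j terms explicitly
  sum(diss)                                  # both (i, j) and (j, i) are included
}

set.seed(1)
x <- matrix(runif(202 * 263), nrow = 202)
diss.vec.all <- total_tanimoto_diss(x)
```

Because the full symmetric matrix is summed, no doubling is needed at the
end. The price is a few temporary 202 x 202 matrices, which is negligible
at this size but grows quadratically with the number of samples.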

Note: I generated the x matrix with:
set.seed(1);x<-matrix(runif(202*263),nrow=202)
(Timings on an iMac 2.16 GHz using 64-bit R)
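
For anyone who wants to repeat the comparison on their own machine, here is
a rough sketch using system.time(). The wrapper names diss_loop and diss_xtx
are invented for this example; diss_loop is the original double loop and
diss_xtx is alternative 2 with the inner index written as seq_len(n - i) + i,
which is an assumption-free rewrite of the same range. Timings will of
course differ from machine to machine.

```r
# Original approach: full double loop, skipping i == j.
diss_loop <- function(x) {
  n <- nrow(x)
  total <- 0
  for (i in 1:n) for (j in 1:n) if (i != j) {
    S1 <- sum(x[i, ] * x[j, ])
    S2 <- sum(x[i, ]^2)
    S3 <- sum(x[j, ]^2)
    total <- total + (1 - S1 / (S2 + S3 - S1))
  }
  total
}

# Alternative 2: look the sums up in x %*% t(x), upper triangle only.
diss_xtx <- function(x) {
  n <- nrow(x)
  xtx <- x %*% t(x)
  total <- 0
  for (i in 1:n) {
    for (j in seq_len(n - i) + i) {          # i+1 .. n, empty when i == n
      S1 <- xtx[i, j]
      total <- total + (1 - S1 / (xtx[i, i] + xtx[j, j] - S1))
    }
  }
  2 * total                                  # each unordered pair counted once
}

set.seed(1)
x <- matrix(runif(202 * 263), nrow = 202)

t_loop <- system.time(r_loop <- diss_loop(x))
t_xtx  <- system.time(r_xtx  <- diss_xtx(x))

# Both versions should agree up to floating-point rounding.
stopifnot(isTRUE(all.equal(r_loop, r_xtx)))

# Elapsed seconds for each version.
print(c(loop = t_loop[["elapsed"]], xtx = t_xtx[["elapsed"]]))
```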

Berend
