[R] how to replace my double for loop which is little efficient!

Berend Hasselman bhh at xs4all.nl
Mon Dec 27 08:39:46 CET 2010



djmuseR wrote:
> 
> On Sun, Dec 26, 2010 at 4:18 AM, bbslover <dluthm at yeah.net> wrote:
> 
>>
>> x: is a matrix  202*263,  that is 202 samples, and 263 independent
>> variables
>>
>> num.compd<-nrow(x); # number of compounds
>> diss.all<-0
>> for( i in 1:num.compd)
>>   for (j in 1:num.compd)
>>      if (i!=j) {
>>
> 
> Isn't this just X'X?
> 
>>        S1<-sum(x[i,]*x[j,])
>>
> Aren't each of S2 and S3 just diag(X'X)?
> 
>>        S2<-sum(x[i,]^2)
>>
>>        S3<-sum(x[j,]^2)
>>        sim2<-S1/(S2+S3-S1)
>>        diss2<-1-sim2
>>        diss.all<-diss.all+diss2}
>>
> 
> I tried
> s1 <- crossprod(x)
> s2 <- diag(s1)
> s3 <-outer(s2, s2, '+') - s1
> s1/s3
> 
> This yields a symmetric matrix with 1's along the diagonal and quantities
> between 0 and 1 in the off-diagonal. Something like it could conceivably
> be used as a similarity matrix. Is that what you're looking for with sim2?
> 
> I agree with Berend: it looks like a problem that could be easily solved
> with some matrix algebra. R can do matrix algebra quite efficiently,
> y'know...
> 
> (BTW, I tried this on a 1000 x 1000 input matrix:
> system.time(myfunc(x))
>    user  system elapsed
>    0.99    0.02    1.02
> 
> I expect it could be improved by an order of magnitude if one actually
> knew what you were computing... )
> 

I did some more work along Dennis' lines:

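xtx <- tcrossprod(x)          # x %*% t(x): dot products of all pairs of rows (S1)
xtd <- diag(xtx)              # squared norm of each row (S2 and S3)
xzz <- outer(xtd,xtd,'+')     # S2 + S3 for every pair (i, j)
zz  <- 1 - xtx/(xzz-xtx)      # dissimilarity; the diagonal is 0, so i == j adds nothing
diss.all <- sum(zz)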

This appears to give the desired result, and it is quite a bit faster than my
alternative 2.
It would indeed be nice to know what is being computed.
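
For anyone who wants to check this, here is a minimal sketch that compares
the double loop with the matrix version on a small random matrix. The
function names loop_diss and matrix_diss, the seed and the 20 x 5 test
matrix are made up for illustration and are not from the thread. The sketch
also shows why tcrossprod is used here rather than crossprod: the loop runs
over rows (samples), not columns (variables). As a side note, the ratio
S1/(S2+S3-S1) has the same form as the Tanimoto coefficient for real-valued
vectors, although the original poster has not confirmed that this is the
intent.

```r
# Hypothetical check, not part of the original thread.
set.seed(1)                            # arbitrary seed, for reproducibility
x <- matrix(runif(20 * 5), nrow = 20)  # 20 samples, 5 variables (made-up sizes)

# The original double loop over all pairs of rows with i != j
loop_diss <- function(x) {
  n <- nrow(x)
  total <- 0
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      if (i != j) {
        s1 <- sum(x[i, ] * x[j, ])
        s2 <- sum(x[i, ]^2)
        s3 <- sum(x[j, ]^2)
        total <- total + (1 - s1 / (s2 + s3 - s1))
      }
    }
  }
  total
}

# The matrix version from this message
matrix_diss <- function(x) {
  xtx <- tcrossprod(x)   # n x n matrix of row dot products
  xtd <- diag(xtx)       # squared row norms
  sum(1 - xtx / (outer(xtd, xtd, "+") - xtx))
}

dim(crossprod(x))    # 5 x 5: products between columns (variables)
dim(tcrossprod(x))   # 20 x 20: products between rows (samples)

# Compares the two totals within floating-point tolerance
all.equal(loop_diss(x), matrix_diss(x))
```

Note that a row consisting entirely of zeros would give 0/0 = NaN in both
versions, so such rows would need to be handled separately.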

Berend


