[R] Correlate rows of 2 matrices

arun smartpink111 at yahoo.com
Mon Sep 23 23:55:24 CEST 2013


Hi Ira,

I tried ?lapply().  It looks like it slightly edges out the ?for() loop.
For example:
 

set.seed(435)
m1 <- matrix(rnorm(2000*30), ncol=30)
m2 <-  matrix(rnorm(2000*30), ncol= 30)
corsP <- vector()

system.time({for(i in 1:2000) corsP[i] <- cor(m1[i,], m2[i,])})
# user  system elapsed 
# 0.124   0.000   0.122 
system.time({corsP2<- unlist(lapply(1:2000,function(i) cor(m1[i,],m2[i,])))})
# user  system elapsed 
# 0.108   0.000   0.110 
identical(corsP,corsP2)
#[1] TRUE
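
Two other ways to hand cor() one row from each matrix, as a minimal sketch that was not timed in this thread: vapply() over the row indices, and mapply() over the two matrices split into lists of rows. The names corsV and corsM are only illustrative.

```r
set.seed(435)
m1 <- matrix(rnorm(2000 * 30), ncol = 30)
m2 <- matrix(rnorm(2000 * 30), ncol = 30)

# vapply() over row indices; FUN.VALUE = numeric(1) checks that every call
# returns exactly one number, so no unlist() is needed afterwards.
corsV <- vapply(seq_len(nrow(m1)),
                function(i) cor(m1[i, ], m2[i, ]),
                numeric(1))

# mapply() walks two lists in parallel, so cor() receives row i of m1 and
# row i of m2. split(m, row(m)) turns a matrix into a list of its rows.
corsM <- mapply(cor, split(m1, row(m1)), split(m2, row(m2)))

# mapply() keeps the list names ("1", "2", ...), so drop them before comparing.
all.equal(corsV, unname(corsM))
```

Both forms still call cor() once per row, so they should behave much like the for() and lapply() versions above.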


system.time(corsP3<- diag(cor(t(m1),t(m2))))
#  user  system elapsed 
#  0.272   0.004   0.276 
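
The full cross-correlation matrix can be avoided without a loop by computing the Pearson formula directly on the rows. The sketch below assumes both matrices have the same dimensions and contain no NA values; row_cor is a hypothetical helper name, not a function from base R or any package.

```r
# Hypothetical helper: Pearson correlation of row i of x with row i of y.
# Assumes identical dimensions and no missing values.
row_cor <- function(x, y) {
  stopifnot(identical(dim(x), dim(y)))
  # A vector of length nrow(x) is recycled down the columns, so subtracting
  # rowMeans(x) removes mean i from every element of row i.
  xc <- x - rowMeans(x)
  yc <- y - rowMeans(y)
  rowSums(xc * yc) / sqrt(rowSums(xc^2) * rowSums(yc^2))
}

set.seed(435)
m1 <- matrix(rnorm(2000 * 30), ncol = 30)
m2 <- matrix(rnorm(2000 * 30), ncol = 30)

corsR <- row_cor(m1, m2)

# Agrees with the diagonal of the full matrix up to floating-point tolerance.
all.equal(corsR, diag(cor(t(m1), t(m2))))
```

This computes only the 2000 correlations that are needed instead of all 2000 * 2000.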



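mNew <- rbind(m1, m2)
indx <- rep(seq(nrow(mNew)/2), 2)
# Each group x holds two row numbers of mNew: row i of m1 and row i of m2.
# cor() on those two rows gives a 2 x 2 matrix; [2] picks the off-diagonal element.
system.time({corsP4 <- tapply(seq_along(indx), list(indx), FUN=function(x) cor(t(mNew[x,]), t(mNew[x,]))[2])})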
#   user  system elapsed 
#  0.156   0.000   0.160 
attr(corsP4,"dimnames")<- NULL
all.equal(corsP,as.vector(corsP4))
#[1] TRUE
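
The same row-by-row approach carries over to the Spearman case with use = "pairwise.complete.obs" mentioned in the quoted messages below. This is a minimal sketch on made-up data: the matrices a and b and the NA position are assumptions for illustration only.

```r
set.seed(22)
a <- matrix(rnorm(50), nrow = 5)
b <- matrix(rnorm(50), nrow = 5)
a[2, 3] <- NA   # illustrative missing value

# One Spearman correlation per row; pairs with an NA in either row are dropped.
corsS <- vapply(seq_len(nrow(a)),
                function(i) cor(a[i, ], b[i, ],
                                use = "pairwise.complete.obs",
                                method = "spearman"),
                numeric(1))
corsS
```

For a single pair of vectors, "pairwise.complete.obs" simply keeps the positions where both values are present, and Spearman is then the Pearson correlation of the ranks of what remains.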


A.K.


________________________________
From: Ira Sharenow <irasharenow100 at yahoo.com>
To: arun <smartpink111 at yahoo.com> 
Sent: Monday, September 23, 2013 5:45 PM
Subject: Re: Correlate rows of 2 matrices



Arun,

What department are you in? Are you on LinkedIn?

The loop takes about a second. I do not know how to use lapply/sapply with more than one object and a function of two variables such as cor().

When there are 2,000 columns it cannot be right to compute 4,000,000 correlations in order to use the 2,000 that are along the diagonal.

Ira 
On 9/23/2013 2:12 PM, arun wrote:

Ira,

I work as a postdoc at Wayne State Univ. in Detroit. I didn't check the speed of ?diag().  It could be a bit slower because it first computes the whole correlation matrix and then takes the diagonal elements.  In that respect, the loop will save time.  It would be worth checking whether ?lapply() improves the speed compared to ?for().

Arun

________________________________
From: Ira Sharenow <irasharenow100 at yahoo.com>
To: arun <smartpink111 at yahoo.com>
Sent: Monday, September 23, 2013 4:42 PM
Subject: Re: Correlate rows of 2 matrices

Arun,

On a contract, I work for this San Francisco firm. But I work from home.
http://www.manifoldpartners.com/Home.html

How about yourself? Where are you located?

Incidentally, for my large matrix, in addition to computing the Pearson correlation matrix with use = "pairwise.complete.obs" (85 seconds), I also have to do Spearman calculations. The code ran for 27 minutes. I only need about 2000 correlations, but I am computing 2000 * 2000 correlations. Using a loop reduced the time to about 1 second.

Please note that this initial data set is one of the smaller ones I will be working on.

Ira

On 9/23/2013 11:54 AM, arun wrote:

Hi Ira,
Glad it worked for you. I would also choose the one you selected.
BTW, where do you work?
Regards,
Arun

________________________________
From: Ira Sharenow <irasharenow100 at yahoo.com>
To: arun <smartpink111 at yahoo.com>
Sent: Monday, September 23, 2013 2:47 PM
Subject: Re: Correlate rows of 2 matrices

Arun,

Thanks for your help. I am very impressed with your ability to string together functions in order to achieve a desired result. On the other hand I prefer simplicity and I will have to explain my code to my boss who might have to eventually modify my code after I’ve moved on. I decided to go with your first option. It worked quite well.

diag(cor(t(m1),t(m2)))

Thanks again.

Ira

On 9/22/2013 6:57 PM, Ira Sharenow wrote:

> Arun,
>
> I have a new problem for you.  I have two data frames (or matrices) and row by row I want to take the correlations. So if I have a 3 row by 10 column matrix, I would produce 3 correlations. Is there a way to merge the matrices and then use some sort of split? Ideas/solutions much appreciated.
>
> set.seed(22)
> m1 = matrix(rnorm(30), nrow = 3)
> m2 = matrix(rnorm(30), nrow = 3)
> corsP = numeric(3)  # must exist before the loop assigns into it
> for(i in 1:3) corsP[i] =  cor(m1[i,], m2[i,])
> corsP
> [1] -0.50865019 -0.27760046  0.01423144
>
> Thanks.
> Ira


