[R] Correlation between 2 matrices but with subset of variables

Petr Savicky savicky at cs.cas.cz
Fri Mar 9 08:48:01 CET 2012


On Thu, Mar 08, 2012 at 03:57:06PM -0800, A Ezhil wrote:
> Dear All,
> I have two matrices A (40 x 732) and B (40 x 1230) and would like to calculate correlation between them. I can use: cor(A,B, method="pearson") to calculate correlation between all possible pairs. But the issue is that there are specific one-to-many mappings between A and B and I just need to calculate correlations for those pairs (not all). Some variables in A (proteins, say p1) have 3 or more (or 2 or 1) corresponding mappings in B (mRNA, say, m1,m2,m3) and I would like to calculate correlations between p1-m1, p1-m2, and p1-m3 and then for the second variable p2 etc.
> I have the mapping information in another file (annotation file). Could you please suggest how to do that?

Hi.

Try the following.

1. Create some simple data

  X <- matrix(rnorm(15), nrow=5, ncol=3)
  Y <- matrix(rnorm(25), nrow=5, ncol=5)

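2. Choose a table of pairs of columns for which the correlation
   should be computed (each row of ind holds a column index of X
   and a column index of Y), and expand the matrices so that
   column i of X1 is paired with column i of Y1.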

  ind <- rbind(
    c(1, 1),
    c(1, 2),
    c(2, 2),
    c(3, 3),
    c(3, 4),
    c(3, 5))

  X1 <- X[, ind[, 1]]
  Y1 <- Y[, ind[, 2]]
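
   If the pairs come from an annotation file rather than being typed in,
   the table ind might be built by matching names against the column names
   of the matrices. The following is only a sketch under assumptions: the
   matrices A and B, their column names, and the data frame annot with
   columns "protein" and "mrna" are all made up for illustration, and a
   real annotation file would first have to be read in, for example with
   read.table().

```r
# Hypothetical data: named columns stand in for proteins and mRNAs.
set.seed(1)
A <- matrix(rnorm(15), nrow = 5, ncol = 3,
            dimnames = list(NULL, c("p1", "p2", "p3")))
B <- matrix(rnorm(25), nrow = 5, ncol = 5,
            dimnames = list(NULL, c("m1", "m2", "m3", "m4", "m5")))

# Hypothetical annotation table: one row per protein-mRNA pair.
annot <- data.frame(protein = c("p1", "p1", "p2", "p3", "p3", "p3"),
                    mrna    = c("m1", "m2", "m2", "m3", "m4", "m5"),
                    stringsAsFactors = FALSE)

# Translate names to column positions; NA marks a name absent from a matrix.
ind <- cbind(match(annot$protein, colnames(A)),
             match(annot$mrna, colnames(B)))

# Drop pairs that could not be matched in either matrix.
keep <- complete.cases(ind)
ind <- ind[keep, , drop = FALSE]
annot <- annot[keep, ]

# Expand the matrices so that column i of A1 pairs with column i of B1.
A1 <- A[, ind[, 1], drop = FALSE]
B1 <- B[, ind[, 2], drop = FALSE]
```

   With these made-up names, ind has the same six rows as the table typed
   in above, and the rows of annot label the resulting correlations.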

3. Compute the correlations between X1[, i] and Y1[, i] and
   compare to the diagonal of cor(X1, Y1)

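  parallel.cor <- function(X, Y)
  {
      # center each column by subtracting its mean
      X <- sweep(X, 2, colMeans(X))
      Y <- sweep(Y, 2, colMeans(Y))
      # Pearson correlation of column i of X with column i of Y
      colSums(X*Y)/sqrt(colSums(X^2)*colSums(Y^2))
  }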

  out <- parallel.cor(X1, Y1)
  verif <- diag(cor(X1, Y1))
  all.equal(out, verif)

  [1] TRUE
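
   The function parallel.cor() computes Pearson correlations only. If
   another method such as "spearman" is needed, one possible alternative is
   to call cor() once per row of ind. This is a sketch, not part of the
   approach above: the helper name pair.cor is made up here, and looping
   over pairs is likely slower than the vectorized version when there are
   many pairs.

```r
set.seed(1)
X <- matrix(rnorm(15), nrow = 5, ncol = 3)
Y <- matrix(rnorm(25), nrow = 5, ncol = 5)
ind <- rbind(c(1, 1), c(1, 2), c(2, 2), c(3, 3), c(3, 4), c(3, 5))

# pair.cor is a hypothetical helper: one cor() call per row of ind.
pair.cor <- function(X, Y, ind, method = "pearson")
{
    vapply(seq_len(nrow(ind)),
           function(i) cor(X[, ind[i, 1]], Y[, ind[i, 2]], method = method),
           numeric(1))
}

out.pearson <- pair.cor(X, Y, ind)
out.spearman <- pair.cor(X, Y, ind, method = "spearman")

# The Pearson result should agree with the diagonal of the full
# correlation matrix of the expanded columns.
all.equal(out.pearson, diag(cor(X[, ind[, 1]], Y[, ind[, 2]])))
```

   This avoids building the expanded matrices X1 and Y1, at the cost of
   one function call per pair.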

Hope this helps.

Petr Savicky.


