[R] extracting pairs from correlation matrix and p-value matrix

Amit amitkumartiwary at gmail.com
Fri Apr 30 16:14:27 CEST 2010


Dear All,

I am working on a large matrix, say 'mat', of dimension 20000x700. I
have calculated Pearson correlations between the rows of the matrix,
and their p-values, using the rcorr function in the Hmisc package. Now
I wish to extract those pairs whose PCC value is above the 0.8 cutoff
and whose p-value is less than 0.05.

>library(Hmisc)
>mat_cor=rcorr(t(mat),type="pearson")
>head(mat_cor)
              aaeA_b3241_14 aaeB_b3240_15 aaeR_b3243_15 aaeX_b3242_12 aas_b2836_14
aaeA_b3241_14          1.00          0.12          0.64          0.21         0.10
aaeB_b3240_15          0.12          1.00         -0.57          0.26        -0.50
aaeR_b3243_15          0.64         -0.57          1.00         -0.02         0.45
aaeX_b3242_12          0.21          0.26         -0.02          1.00        -0.45
aas_b2836_14           0.10         -0.50          0.45         -0.45         1.00
aat_b0885_14          -0.68          0.05         -0.52         -0.60        -0.08
abgA_b1338_14         -0.61          0.50         -0.93         -0.20        -0.53
abgB_b1337_15         -0.62          0.46         -0.92         -0.16        -0.57
abgR_b1339_15         -0.66          0.40         -0.91         -0.22        -0.52
abgT_b1336_15         -0.67          0.47         -0.96         -0.07        -0.58

              aat_b0885_14 abgA_b1338_14 abgB_b1337_15 abgR_b1339_15 abgT_b1336_15
aaeA_b3241_14        -0.68         -0.61         -0.62         -0.66         -0.67
aaeB_b3240_15         0.05          0.50          0.46          0.40          0.47
aaeR_b3243_15        -0.52         -0.93         -0.92         -0.91         -0.96
aaeX_b3242_12        -0.60         -0.20         -0.16         -0.22         -0.07
aas_b2836_14         -0.08         -0.53         -0.57         -0.52         -0.58
aat_b0885_14          1.00          0.63          0.61          0.64          0.59
abgA_b1338_14         0.63          1.00          0.99          0.99          0.98
abgB_b1337_15         0.61          0.99          1.00          1.00          0.99
abgR_b1339_15         0.64          0.99          1.00          1.00          0.98
abgT_b1336_15         0.59          0.98          0.99          0.98          1.00

n= 20000


P
              aaeA_b3241_14 aaeB_b3240_15 aaeR_b3243_15 aaeX_b3242_12 aas_b2836_14
aaeA_b3241_14                      0.7401        0.0457        0.5556       0.7774
aaeB_b3240_15        0.7401                      0.0868        0.4644       0.1435
aaeR_b3243_15        0.0457        0.0868                      0.9529       0.1915
aaeX_b3242_12        0.5556        0.4644        0.9529                     0.1905
aas_b2836_14         0.7774        0.1435        0.1915        0.1905
aat_b0885_14         0.0289        0.8840        0.1198        0.0670       0.8295
abgA_b1338_14        0.0629        0.1375        0.0000        0.5813       0.1167
abgB_b1337_15        0.0561        0.1799        0.0001        0.6610       0.0851
abgR_b1339_15        0.0371        0.2522        0.0002        0.5441       0.1224
abgT_b1336_15        0.0351        0.1747        0.0000        0.8422       0.0784

              aat_b0885_14 abgA_b1338_14 abgB_b1337_15 abgR_b1339_15 abgT_b1336_15
aaeA_b3241_14       0.0289        0.0629        0.0561        0.0371        0.0351
aaeB_b3240_15       0.8840        0.1375        0.1799        0.2522        0.1747
aaeR_b3243_15       0.1198        0.0000        0.0001        0.0002        0.0000
aaeX_b3242_12       0.0670        0.5813        0.6610        0.5441        0.8422
aas_b2836_14        0.8295        0.1167        0.0851        0.1224        0.0784
aat_b0885_14                      0.0521        0.0626        0.0449        0.0740
abgA_b1338_14       0.0521                      0.0000        0.0000        0.0000
abgB_b1337_15       0.0626        0.0000                      0.0000        0.0000
abgR_b1339_15       0.0449        0.0000        0.0000                      0.0000
abgT_b1336_15       0.0740        0.0000        0.0000        0.0000

Now from mat_cor$r and mat_cor$P I wish to extract those pairs which
satisfy both conditions, PCC > 0.8 and p-value < 0.05, but I am puzzled
about how to do it. Please help.
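
To make the goal concrete, here is a rough sketch of the kind of
extraction meant, on a toy four-gene subset whose values are copied from
the output above. The names toy_cor, keep, idx and hits are made up for
illustration, and the list is built by hand to mimic the $r and $P
components rather than produced by rcorr. It is only a sketch, and
whether it is practical on the full 20000 x 20000 matrices is a separate
question.

```r
# Toy stand-in for mat_cor: four genes, values copied from the printed output.
genes <- c("aaeR_b3243_15", "abgA_b1338_14", "abgB_b1337_15", "abgT_b1336_15")
r <- matrix(c( 1.00, -0.93, -0.92, -0.96,
              -0.93,  1.00,  0.99,  0.98,
              -0.92,  0.99,  1.00,  0.99,
              -0.96,  0.98,  0.99,  1.00),
            nrow = 4, byrow = TRUE, dimnames = list(genes, genes))
# The diagonal of P is NA (it prints as blank in the rcorr output).
P <- matrix(c(NA,     0.0000, 0.0001, 0.0000,
              0.0000, NA,     0.0000, 0.0000,
              0.0001, 0.0000, NA,     0.0000,
              0.0000, 0.0000, 0.0000, NA),
            nrow = 4, byrow = TRUE, dimnames = list(genes, genes))
toy_cor <- list(r = r, P = P)  # hypothetical name, same shape as mat_cor

# Both matrices are symmetric, so look only at the upper triangle:
# each pair is then reported once and the diagonal is skipped.
keep <- upper.tri(toy_cor$r) &
        toy_cor$r > 0.8 &
        !is.na(toy_cor$P) & toy_cor$P < 0.05

# Row/column positions of the surviving cells.
idx <- which(keep, arr.ind = TRUE)

# One row per pair, with its correlation and p-value.
hits <- data.frame(
  gene1 = rownames(toy_cor$r)[idx[, "row"]],
  gene2 = colnames(toy_cor$r)[idx[, "col"]],
  r     = toy_cor$r[idx],
  p     = toy_cor$P[idx],
  stringsAsFactors = FALSE
)
print(hits)
```

With the condition written as r > 0.8, the strongly negative pairs
(for example aaeR_b3243_15 with abgT_b1336_15, r = -0.96) are not
selected; abs(toy_cor$r) > 0.8 would include them.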

regards
Amit


