[R] an efficient way to calculate correlation matrix

Dennis Murphy djmuser at gmail.com
Thu Jun 2 18:27:00 CEST 2011


?cor

Example:

> dd <- data.frame(x1 = rnorm(40), x2 = rnorm(40), x3 = runif(40, 0, 10))
> str(dd)
'data.frame':   40 obs. of  3 variables:
 $ x1: num  -0.5585 1.3831 -1.7862 0.0572 0.2825 ...
 $ x2: num  -0.5247 -0.8636 -0.0749 0.2399 -0.1592 ...
 $ x3: num  7.698 5.259 0.918 3.251 5.169 ...
> cor(dd)
           x1          x2          x3
x1  1.0000000 -0.23268659 -0.02915700
x2 -0.2326866  1.00000000 -0.07073142
x3 -0.0291570 -0.07073142  1.00000000
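
To tie this back to the nested-loop approach in the question, here is a
rough sketch comparing the two on simulated data. The sizes n and m, the
seed, and the object names (X, loop_cor, fast_cor) are made up for
illustration; they are not from the original post.

```r
# Sketch: nested loops versus a single cor() call on simulated data.
set.seed(1)                      # arbitrary seed, for reproducibility only
n <- 200; m <- 30                # hypothetical sizes
X <- as.data.frame(matrix(rnorm(n * m), nrow = n, ncol = m))

# Nested-loop version: one cor() call per pair of columns
loop_cor <- matrix(NA_real_, m, m, dimnames = list(names(X), names(X)))
for (i in seq_len(m)) {
  for (j in seq_len(m)) {
    loop_cor[i, j] <- cor(X[[i]], X[[j]])
  }
}

# Single call: cor() returns the full m x m matrix at once
fast_cor <- cor(X)

# Check that the two results agree up to numerical tolerance
all.equal(loop_cor, fast_cor)
```

Wrapping either version in system.time() should show how the gap grows
with m, since the loop makes m^2 separate calls to cor().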

cor() also accepts a matrix of numeric variables. Any factor or
character variable among the columns passed to cor() will cause an
error; for example, with the Oats data from the nlme package:

> library(nlme)
> head(Oats, 3)
Grouped Data: yield ~ nitro | Block
  Block     Variety nitro yield
1     I     Victory   0.0   111
2     I     Victory   0.2   130
3     I     Victory   0.4   157
> cor(Oats)
Error in cor(Oats) : 'x' must be numeric
> cor(Oats[, 3:4])
          nitro     yield
nitro 1.0000000 0.6130266
yield 0.6130266 1.0000000
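
Rather than picking the numeric columns by position, they can be
selected programmatically. The following is only a sketch on a small
made-up data frame (df and its columns are hypothetical, not from the
example above):

```r
# Hypothetical data frame mixing numeric and non-numeric columns
df <- data.frame(
  group = factor(rep(c("a", "b"), each = 5)),
  label = rep(c("u", "v"), times = 5),
  x     = c(1.2, 2.3, 3.1, 4.8, 5.0, 6.4, 7.7, 8.1, 9.9, 10.2),
  y     = c(2.0, 1.5, 3.9, 4.1, 6.3, 5.8, 8.2, 7.9, 9.1, 11.0),
  stringsAsFactors = FALSE
)

# Logical vector flagging which columns are numeric
is_num <- vapply(df, is.numeric, logical(1))

# Correlation matrix of the numeric columns only
cor(df[, is_num, drop = FALSE])
```

The drop = FALSE keeps the result a data frame even if only one numeric
column is present.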

HTH,
Dennis

On Thu, Jun 2, 2011 at 8:48 AM, Bill Hyman <billhyman1 at yahoo.com> wrote:
> Dear all,
>
> I have a problem. I have m variables, each of which has n observations. I want to
> calculate pairwise correlations among the m variables and store the values in an
> m x m matrix. It is extremely slow to use nested 'for' loops if m and n are large.
> Is there any efficient alternative to do this? Many thanks for your
> suggestions!!
>
> Bill