[R] generating random covariance matrices (with a uniform distribution of correlations)

Ned Dochtermann ned.dochtermann at gmail.com
Mon Jun 6 21:18:37 CEST 2011


Thank you very much, this does help quite a bit.
Ned


From: Petr Savicky <savicky_at_praha1.ff.cuni.cz>
Date: Sat, 04 Jun 2011 11:44:52 +0200

On Fri, Jun 03, 2011 at 01:54:33PM -0700, Ned Dochtermann wrote:
> Petr,
> This is the code I used for your suggestion:
>
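> k <- 6; kk <- k * (k - 1) / 2      # kk = number of off-diagonal pairs
> x <- matrix(0, 5000, kk)
> for (i in 1:5000) {
>   A.1 <- matrix(0, k, k)
>   rs <- runif(kk, min = -1, max = 1)
>   A.1[lower.tri(A.1)] <- rs
>   A.1[upper.tri(A.1)] <- t(A.1)[upper.tri(A.1)]   # make A.1 symmetric
>   cors.i <- diag(k)
>   # shift the diagonal so that the smallest eigenvalue becomes 0.001
>   shift <- .001 - min(eigen(A.1, symmetric = TRUE)$values)
>   new.cor <- cov2cor(A.1 + shift * cors.i)
>   x[i, ] <- new.cor[lower.tri(new.cor)]
> }
> hist(c(x)); max(c(x)); median(c(x))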
>
> This, unfortunately, does not maintain the desired distribution of
> correlations.

Hello.

Contrary to what I originally thought, there are also solutions for
the case of a correlation matrix. The first solution creates a singular
correlation matrix (of rank 3), but its off-diagonal entries are exactly
uniformly distributed on [-1, 1], since the scalar product of two independent
uniformly distributed unit vectors in R^3 is uniformly distributed on
[-1, 1].

  x <- matrix(rnorm(18), nrow=6, ncol=3)
  x <- x/sqrt(rowSums(x^2))
  a <- x %*% t(x)
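
The claims about this construction (unit diagonal, rank 3, uniform
off-diagonal entries) can be checked empirically. The following is only a
sketch: the helper name rand_cor_rank3, the seed, and the number of
replicates are illustrative assumptions, not part of the original suggestion.

```r
set.seed(1)

# Hypothetical helper wrapping the three lines above for a k x k matrix.
rand_cor_rank3 <- function(k = 6) {
  x <- matrix(rnorm(3 * k), nrow = k, ncol = 3)
  x <- x / sqrt(rowSums(x^2))   # scale each row to a unit vector in R^3
  x %*% t(x)                    # Gram matrix of the rows
}

a <- rand_cor_rank3()
ev <- eigen(a, symmetric = TRUE)$values
n_pos <- sum(ev > 1e-8)         # number of non-negligible eigenvalues

# One off-diagonal entry per simulated matrix, so the sample is independent.
r <- replicate(2000, rand_cor_rank3()[2, 1])
ks <- ks.test(r, "punif", min = -1, max = 1)

print(diag(a))
print(n_pos)
print(ks$p.value)
```

Only one entry is taken from each matrix because entries of the same matrix
are not mutually independent, which the KS test assumes.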


The next solution produces a correlation matrix of full rank whose
off-diagonal entries have a distribution very close to the uniform on [-1, 1].
A KS test finds a difference only with sample sizes above 50,000.

  w <- c(0.01459422, 0.01830718, 0.04066405, 0.50148488, 0.60330865,
         0.61832829)
  x <- matrix(rnorm(36), nrow=6, ncol=6) %*% diag(w)
  x <- x/sqrt(rowSums(x^2))
  a <- x %*% t(x)
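
Since the original question asked for covariance matrices, a correlation
matrix such as the one above can be rescaled by a vector of standard
deviations. The following is only a sketch: the vector s and its range are
assumptions chosen for illustration, and the seed is arbitrary.

```r
set.seed(2)

w <- c(0.01459422, 0.01830718, 0.04066405, 0.50148488, 0.60330865,
       0.61832829)
x <- matrix(rnorm(36), nrow = 6, ncol = 6) %*% diag(w)
x <- x / sqrt(rowSums(x^2))
a <- x %*% t(x)

# Full rank means the smallest eigenvalue is strictly positive.
ev <- eigen(a, symmetric = TRUE)$values
print(min(ev))

# Hypothetical standard deviations; any positive values would do.
s <- runif(6, min = 0.5, max = 2)
v <- diag(s) %*% a %*% diag(s)   # covariance matrix with correlations a

# Converting back recovers the correlation matrix.
print(all.equal(cov2cor(v), a))
```

The smallest eigenvalue is tiny because three of the weights in w are close
to zero, so the matrix is of full rank but close to a rank-3 matrix.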


Hope this helps.

Petr Savicky.

-----Original Message-----
From: Ned Dochtermann [mailto:ned.dochtermann at gmail.com] 
Sent: Friday, June 03, 2011 1:55 PM
To: 'r-help at r-project.org'; 'savicky at praha1.fff.cuni.cz'
Subject: Re: [R] generating random covariance matrices (with a uniform
distribution of correlations)

Petr,
This is the code I used for your suggestion:

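	k <- 6; kk <- k * (k - 1) / 2      # kk = number of off-diagonal pairs
	x <- matrix(0, 5000, kk)
	for (i in 1:5000) {
	  A.1 <- matrix(0, k, k)
	  rs <- runif(kk, min = -1, max = 1)
	  A.1[lower.tri(A.1)] <- rs
	  A.1[upper.tri(A.1)] <- t(A.1)[upper.tri(A.1)]   # make A.1 symmetric
	  cors.i <- diag(k)
	  # shift the diagonal so that the smallest eigenvalue becomes 0.001
	  shift <- .001 - min(eigen(A.1, symmetric = TRUE)$values)
	  new.cor <- cov2cor(A.1 + shift * cors.i)
	  x[i, ] <- new.cor[lower.tri(new.cor)]
	}
	hist(c(x)); max(c(x)); median(c(x))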

This, unfortunately, does not maintain the desired distribution of
correlations.
I did, however, learn some neat coding tricks (that were new for me) along
the way.

Ned
--
On Thu, Jun 02, 2011 at 04:42:59PM -0700, Ned Dochtermann wrote:
> List members,
> 
> Via searches I've seen similar discussion of this topic but have not seen
> resolution of the particular issue I am experiencing. If my search on this
> topic failed, I apologize for the redundancy. I am attempting to generate
> random covariance matrices but would like the corresponding correlations to
> be uniformly distributed between -1 and 1. 
> 
...
> 
> Any recommendations on how to generate the desired covariance matrices would
> be appreciated.

Hello.

Let me suggest the following procedure.

1. Generate a symmetric matrix A with the desired distribution of the
   non-diagonal elements and with zeros on the diagonal.
2. Compute the smallest eigenvalue lambda_1 of A.
3. Replace A by A + t I, where I is the identity matrix and t is a
   number such that t + lambda_1 > 0.

The resulting matrix will have the same non-diagonal elements as A,
but will be positive definite.
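
The three steps can be sketched in R as follows. The variable names, the
seed, and the margin 0.001 are illustrative assumptions. The last lines show
a side effect worth knowing: every diagonal entry of the shifted matrix
equals the shift, so applying cov2cor() to it divides all off-diagonal
entries by the shift, and the resulting correlations no longer follow the
distribution chosen in step 1 unless the shift happens to be 1.

```r
set.seed(3)
k <- 6

# Step 1: symmetric matrix with uniform off-diagonal entries, zero diagonal.
A <- matrix(0, k, k)
A[lower.tri(A)] <- runif(k * (k - 1) / 2, min = -1, max = 1)
A <- A + t(A)

# Step 2: smallest eigenvalue of A (not positive, since the trace is zero).
lambda1 <- min(eigen(A, symmetric = TRUE)$values)

# Step 3: add shift * I, where shift + lambda1 > 0.
shift <- 0.001 - lambda1
B <- A + shift * diag(k)

# Off-diagonal entries are unchanged and B is positive definite.
print(all.equal(B[lower.tri(B)], A[lower.tri(A)]))
print(min(eigen(B, symmetric = TRUE)$values) > 0)

# As a correlation matrix, B has off-diagonal entries A / shift.
C <- cov2cor(B)
print(all.equal(C[lower.tri(C)], A[lower.tri(A)] / shift))
```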

Petr Savicky.


