[R] Problem when creating matrix of values based on covariance matrix

R. Michael Weylandt michael.weylandt at gmail.com
Mon Aug 13 02:52:42 CEST 2012


On Sun, Aug 12, 2012 at 1:46 PM, Boel Brynedal <brynedal at gmail.com> wrote:
> A clarification - yes, calculating the Pearson covariance does give
> the expected results. I don't fully understand why yet, but many thanks
> for this help!

I'm not sure that the Spearman correlation is an appropriate estimator
for the covariance matrix of a multivariate normal, which is defined
in terms of the Pearson correlation matrix. (More bluntly, Pearson and
Spearman are different measures, and one won't converge to the other.)
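
To make the distinction concrete, here is a minimal sketch. The
variables x and y and the exp() transform are illustrative choices only,
not part of the original problem. Spearman is Pearson computed on the
ranks, so a strictly increasing transform leaves it unchanged, whereas
Pearson does change:

```r
set.seed(42)
x <- rnorm(1000)
y <- 0.5 * x + rnorm(1000)

# Spearman is the Pearson correlation of the ranks
all.equal(cor(x, y, method = "spearman"), cor(rank(x), rank(y)))

# exp() is strictly increasing, so the ranks (and hence Spearman) are unchanged ...
all.equal(cor(x, y, method = "spearman"),
          cor(exp(x), exp(y), method = "spearman"))

# ... but Pearson measures linear association and does change
cor(x, y)
cor(exp(x), exp(y))
```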

A quick unscientific test:

#################

set.seed(1)
covMat <- matrix(c(1, 0.4875, 0.4875, 1), 2, 2) # Arbitrary
library(MASS)
library(TTR) # For fast pearson cor

n <- 5000
# I'm pretty sure we want empirical = TRUE
rands <- mvrnorm(n, c(0,0), covMat, empirical = TRUE)

runPearson <- runCor(rands[,1], rands[,2], cumulative = TRUE)

# This takes a little while but I'm doing my best to make it fast ;-)
# (k is the number of leading rows used; it avoids shadowing the outer n)
runSpearman <- vapply(seq(10, n), function(k)
    cor(rands[seq_len(k), ], method = "spearman")[2], numeric(1))

plot(runPearson, type = "l")
lines(seq(10, n), runSpearman, col = 2) # runSpearman starts at the 10th observation

# Show that we get good/decent convergence
abline(h = covMat[2], col = 3)

##############

That long, stable gap between the two curves suggests to me that you
don't want to use cor(, , method = "spearman") to estimate a quantity
defined in terms of cor(, , method = "pearson").

I am not sure whether this is a general, fixed bias in the Spearman
estimator or just a function of the covMat I arbitrarily chose.
Prof. Dalgaard and many others on this list must know.
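
For what it's worth, one standard result bears on this: for a bivariate
normal with Pearson correlation rho, the population Spearman correlation
is (6/pi) * asin(rho/2). That would make the gap systematic and dependent
on rho rather than an accident of one covMat. A rough sketch checking it
by simulation follows; the helper names spearmanFromPearson and
pearsonFromSpearman are hypothetical, introduced only for illustration,
and the set of rho values and the sample size are arbitrary:

```r
library(MASS)

# Population Spearman correlation implied by a bivariate normal with
# Pearson correlation rho, and the inverse map (hypothetical helper names)
spearmanFromPearson <- function(rho) 6 / pi * asin(rho / 2)
pearsonFromSpearman <- function(rhoS) 2 * sin(pi * rhoS / 6)

set.seed(1)
rhos <- c(0.2, 0.4875, 0.8) # arbitrary values, including the one used above
check <- t(sapply(rhos, function(rho) {
  sigma <- matrix(c(1, rho, rho, 1), 2, 2)
  z <- mvrnorm(20000, c(0, 0), sigma)
  c(pearson        = rho,
    spearmanTheory = spearmanFromPearson(rho),
    spearmanSample = cor(z[, 1], z[, 2], method = "spearman"))
}))
print(round(check, 3))

# Mapping a Spearman value back onto the Pearson scale recovers rho
pearsonFromSpearman(spearmanFromPearson(0.4875))
```

For rho = 0.4875 the formula gives about 0.47, which is the size of gap
one would expect to see between the two running estimates. This only
holds under bivariate normality, so treat it as a sanity check rather
than a general correction.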

Cheers,
Michael


