[R] spearman rank correlation problem

William T Morgan wmorgan at mitre.org
Mon Mar 15 22:37:08 CET 2004


Hello R gurus,

I want to calculate Spearman's rho between two ranked lists. The
results I get from cor.test differ from those of my own Spearman
function:

  > my.spearman
  function(l1, l2) {
    if(length(l1) != length(l2)) stop("lists must have same length")
    r1 <- rank(l1)
    r2 <- rank(l2)
    dsq <- (r1 - r2)^2
    1 - ((6 * sum(dsq)) / (length(l1) * (length(l1)^2 - 1)))
  }

Perhaps I'm doing something wrong in that code, but it's a pretty
straightforward calculation, so it's hard to see what, especially with
rank() handling the ties correctly. One example difference:

  > a
   [1]  0.112761940  0.130260949 -0.010567817 -0.411906701  0.004588443
   [6] -0.034337846 -0.148082981 -0.243724351  0.186690390  0.408983820
  > b
   [1]  8 13 14 15  5  7  8  2 19 19
  > cor.test(a,b,method="spearman")

  	Spearman's rank correlation rho

  data:  a and b 
  S = 85, p-value = 0.1544
  alternative hypothesis: true rho is not equal to 0 
  sample estimates:
        rho 
  0.4878139 

  Warning message: 
  p-values may be incorrect due to ties in: cor.test.default(a, b, method = "spearman") 
  > my.spearman(a,b)
  [1] 0.4909091
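
For comparison, the shortcut formula 1 - 6*sum(d^2)/(n*(n^2 - 1)) is exact only when neither list contains ties. With ties (b has two tied pairs), Spearman's rho is defined as the Pearson correlation of the midranks, and that appears to be the estimate cor.test reports here. A minimal sketch under that assumption; the helper names rank.pearson and shortcut are hypothetical, and the data are retyped from the listing above:

```r
a <- c(0.112761940, 0.130260949, -0.010567817, -0.411906701, 0.004588443,
       -0.034337846, -0.148082981, -0.243724351, 0.186690390, 0.408983820)
b <- c(8, 13, 14, 15, 5, 7, 8, 2, 19, 19)

# Spearman's rho as the Pearson correlation of the midranks (valid with ties)
rank.pearson <- function(x, y) cor(rank(x), rank(y))

# Shortcut formula; exact only when there are no ties (hypothetical helper)
shortcut <- function(x, y) {
  n <- length(x)
  d <- rank(x) - rank(y)
  1 - 6 * sum(d^2) / (n * (n^2 - 1))
}

rho.ties  <- rank.pearson(a, b)  # about 0.4878, the cor.test estimate
rho.short <- shortcut(a, b)      # about 0.4909, the my.spearman value
```

The two values agree whenever rank() produces no ties in either list, so the gap above would be explained by the ties in b alone.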

As you can see, the two values are not quite the same. A second example:

  > c
   [1] 0 0 0 0 0 0 0 0 0 0
  > cor.test(a,c,method="spearman")

  	Spearman's rank correlation rho

  data:  a and c 
  S = NA, p-value = NA
  alternative hypothesis: true rho is not equal to 0 
  sample estimates:
  rho 
   NA 

  Warning message: 
  The standard deviation is zero in: cor(x, y, na.method, method == "kendall") 
  > my.spearman(a,c)
  [1] 0.5
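
The 0.5 can be traced by hand: every element of a constant vector receives the midrank 5.5, so sum(d^2) is 82.5 and the shortcut formula gives 1 - 6*82.5/990 = 0.5. The ranks of that vector have zero standard deviation, though, so a Pearson correlation of the ranks is undefined, which is consistent with the NA from cor.test. A small sketch of that arithmetic, assuming n = 10 and using the hypothetical name c0 to avoid masking the built-in c():

```r
a <- c(0.112761940, 0.130260949, -0.010567817, -0.411906701, 0.004588443,
       -0.034337846, -0.148082981, -0.243724351, 0.186690390, 0.408983820)
c0 <- rep(0, 10)                 # the all-zero list from the example

n   <- length(a)
r.c <- rank(c0)                  # every element tied: all midranks are 5.5
d2  <- sum((rank(a) - r.c)^2)    # sum of squared rank differences: 82.5

rho.short <- 1 - 6 * d2 / (n * (n^2 - 1))  # shortcut formula gives 0.5
sd.ranks  <- sd(r.c)                       # 0, so cor() has nothing to scale by
```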

Any suggestions as to what I'm doing wrong?

Thanks in advance,

-- 
William Morgan
wmorgan at mitre dot org



