[Rd] Spearman's rank correlation test

Thu Feb 12 13:33:33 CET 2009

Hi All:

help(cor.test) claims
  For Spearman's test, p-values are computed using algorithm AS 89.

Algorithm AS 89 was introduced by the paper
  D. J. Best & D. E. Roberts (1975), Algorithm AS 89: The Upper Tail
  Probabilities of Spearman's rho. Applied Statistics, Vol. 24, No. 3, 377-379.
Table 1(a) in this paper presents the maximum absolute error |\Delta_m| of the
approximation over all possible values of the statistic S for sample sizes
n = 7, 9, 11, 13. The presented errors are

   n  |\Delta_m|

   7  0.0046
   9  0.0011
  11  0.0006
  13  0.0005
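
For small n, the exact null distribution against which such errors are
measured can be obtained by brute force. The following is only a minimal
sketch, assuming full enumeration of all n! rankings is acceptable (it is for
n = 7, but not practical in this form for n = 11 or 13). The helper names
perms() and exact_upper() are hypothetical, and this is not necessarily how
Best & Roberts computed their reference values.

```r
# Enumerate all permutations of v as rows of a matrix (base R only).
perms <- function(v) {
  if (length(v) <= 1) return(matrix(v, nrow = 1))
  do.call(rbind, lapply(seq_along(v), function(i) cbind(v[i], perms(v[-i]))))
}

n <- 7
P <- perms(seq_len(n))                        # 5040 x 7 matrix of rankings
S_all <- rowSums(sweep(P, 2, seq_len(n))^2)   # S = sum of squared rank differences

# Exact upper-tail probability P(S >= s) under the null hypothesis
exact_upper <- function(s) mean(S_all >= s)

exact_upper(0)                  # 1: every ranking has S >= 0
exact_upper(n * (n^2 - 1) / 3)  # 1/5040: only the reversed ranking attains the maximum
```

An approximation error at a given S is then the absolute difference between
the approximate tail probability and exact_upper(S).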

Due to the problem explained in detail, together with a patch, at
  https://stat.ethz.ch/pipermail/r-devel/2009-January/051936.html
the error of the R implementation of Spearman's rank correlation test is larger
than the above bound for the sample size n = 11 and some values of S
that correspond to positive correlation.

For example, for n = 11 and S = 90, we have
  x <- 1:11
  y <- c(6:1, 7, 11:8)
  out <- cor.test(x, y, method="spearman", alternative="greater")
  out$statistic # 90
  out$p.value   # 0.02921104
while the correct p-value is 0.03044548, so the absolute difference
is 0.00123444. This is larger than the maximum absolute error 0.0006 reported
for AS 89 at n = 11. In my opinion, this means that the claim from
help(cor.test) cited above is not correct.
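
The arithmetic in this example can be checked without calling cor.test at
all. The sketch below recomputes S from its definition and compares the
difference of the two p-values quoted above with the n = 11 entry of the
table. The variable names are only illustrative, and the p-values are copied
from this message, not recomputed.

```r
x <- 1:11
y <- c(6:1, 7, 11:8)

# Spearman's S is the sum of squared rank differences
S <- sum((rank(x) - rank(y))^2)
S                        # 90

p_R     <- 0.02921104    # p-value reported by cor.test, as quoted above
p_exact <- 0.03044548    # correct p-value, as quoted above
bound   <- 0.0006        # Table 1(a) entry for n = 11

abs_err <- abs(p_exact - p_R)
abs_err                  # 0.00123444
abs_err > bound          # TRUE
```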

To see the error of AS 89 in the example above, one can use
  cor.test(x, -y, method="spearman", alternative="less")$p.value # 0.03036413
since R calls AS 89 correctly on the side of negative correlation.
So, for the x, y above, a correct call to AS 89 has absolute error 0.00008135.
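
The reason the mirrored call tests the same event is the symmetry of S:
replacing y by -y reverses its ranks, which maps S to n(n^2 - 1)/3 - S, and
the null distribution of S is symmetric about half of that maximum. Hence
P(S <= 90) for (x, y) equals P(S >= 350) for (x, -y). A small check of this
identity and of the error quoted above (variable names are only illustrative):

```r
x <- 1:11
y <- c(6:1, 7, 11:8)
n <- length(x)

S_pos <- sum((rank(x) - rank(y))^2)     # 90
S_neg <- sum((rank(x) - rank(-y))^2)    # 350
S_pos + S_neg == n * (n^2 - 1) / 3      # TRUE, both sides are 440

# Error of the correctly called AS 89, from the p-values quoted above
abs(0.03044548 - 0.03036413)            # 8.135e-05, i.e. 0.00008135
```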

There is a package pspearman, currently available on CRAN, which provides a
correction of the problem without the need to modify base R.
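
A possible way to try it on the example above is sketched here. It assumes
that the package exports spearman.test() with an approximation argument
accepting "exact", which is how I understand its documentation; please check
help(spearman.test) in the installed version before relying on it.

```r
x <- 1:11
y <- c(6:1, 7, 11:8)

if (requireNamespace("pspearman", quietly = TRUE)) {
  # Assumption: approximation = "exact" requests the exact null distribution
  out <- pspearman::spearman.test(x, y, alternative = "greater",
                                  approximation = "exact")
  print(out$p.value)
} else {
  message("pspearman is not installed; try install.packages(\"pspearman\")")
}
```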

Petr.