[R] R/S-Plus/SAS yield different results for Kendall-tau and Spearman nonparametric regression

Peter Dalgaard p.dalgaard at biostat.ku.dk
Mon Aug 15 05:18:20 CEST 2005


Dennis Fisher <fisher at plessthan.com> writes:

> Colleagues,
> I ran some nonparametric regressions in R (run in RedHat Linux), then  
> a colleague repeated the analyses in SAS.  When we obtained different  
> results, I tested S-Plus (same Linux box).  And, got yet different  
> results.  I replicated the results with a small dataset:
> 
> DATA:
 (They came across somewhat garbled, but we'll believe you...)

...

> Each of the programs yields some differences, possibly because of how  
> ties are handled (R warns about this).  Can anyone enlighten me?

Ties are certainly involved in the Spearman case. There are more
accurate expressions for the variance of the test statistic in the
tied case than the formula that R is using. As you see, the
difference is not exactly huge (at least for a small number of ties),
but it is something that we should get around to fixing.
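
To see where ties enter on the R side, here is a minimal sketch. The
toy data are hypothetical (they are not the data from the original
post), and the warning check assumes a reasonably recent R in which
cor.test() signals a warning when it meets ties in the Spearman case.
It only illustrates that the estimate itself is the Pearson
correlation of the midranks; it does not reproduce the variance
formulas discussed above.

```r
# Hypothetical toy data with ties in both variables (not the poster's data)
x <- c(1, 2, 2, 3, 4, 5)
y <- c(2, 1, 3, 3, 5, 4)

# Spearman's rho in R is the Pearson correlation of the (mid)ranks
rho       <- cor(x, y, method = "spearman")
rho_ranks <- cor(rank(x), rank(y))

# Record whether cor.test() signals a warning on tied data
# (assumption: current R warns here; the exact wording may vary by version)
warned <- FALSE
res <- withCallingHandlers(
  cor.test(x, y, method = "spearman"),
  warning = function(w) {
    warned <<- TRUE
    invokeRestart("muffleWarning")
  }
)

print(c(rho = rho, rho_ranks = rho_ranks))
print(warned)
print(res$p.value)
```

The point estimate is unaffected by which variance formula is used,
so any disagreement between packages on data like this should show up
in the p-value rather than in rho.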

I assume that there is a similar issue with Kendall's tau. In
addition, S-PLUS appears to modify the actual definition of the test
statistic, which might be a matter of taste. (Kendall's tau relies
on counting concordant and discordant pairs relative to the total
number of pairs, and with ties, some pairs will be undecided. You can
either discard such pairs or count them as zeros. S-PLUS appears to be
doing the latter. A quick test is to note that x <- y <- rep(0:1,4)
gives a tau that is less than 1 in S-PLUS but gives 1 in R.)
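
The quick test can be spelled out by counting the pairs by hand. This
is only a sketch of the two conventions described above; the variable
names are made up for illustration, and the "count as zeros" value is
what that convention yields arithmetically, not a verified S-PLUS
output.

```r
# Quick test from the text: heavily tied data with x identical to y
x <- y <- rep(0:1, 4)

# All index pairs i < j (28 pairs for n = 8)
idx <- combn(length(x), 2)

# Per-pair score: +1 concordant, -1 discordant, 0 if tied in x or y
s <- sign(x[idx[1, ]] - x[idx[2, ]]) * sign(y[idx[1, ]] - y[idx[2, ]])

n_conc <- sum(s > 0)   # 16
n_disc <- sum(s < 0)   # 0
n_tied <- sum(s == 0)  # 12

# Convention 1: discard undecided (tied) pairs
tau_discard <- (n_conc - n_disc) / (n_conc + n_disc)  # 1

# Convention 2: keep tied pairs in the denominator, scoring them as zero
tau_zeros <- (n_conc - n_disc) / length(s)            # 16/28, about 0.571

# What R itself reports for the same data
tau_r <- cor(x, y, method = "kendall")                # 1

print(c(discard = tau_discard, zeros = tau_zeros, R = tau_r))
```

With 12 of the 28 pairs tied, the two conventions differ considerably
on this example, which is what makes it a convenient check.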

-- 
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark          Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)                  FAX: (+45) 35327907



