[R] Bug in Kendall for n<4?
maechler at stat.math.ethz.ch
Mon Nov 24 11:24:18 CET 2008
Thanks a lot for your clarifications.
>>>>> "AIM" == A I McLeod <aim at stats.uwo.ca>
>>>>> on Sat, 22 Nov 2008 22:24:11 -0500 (EST) writes:
AIM> The package Kendall computes the p-value when there are
AIM> ties in one ranking. This often happens with trend
AIM> testing with environmental data. I get about 5-10
AIM> emails per year from scientists using Kendall for that
AIM> purpose who don't know how to use R very well. I
AIM> suspect this means there are many users of this
Indeed, the case of ties in the data is an important one in
possibly many applications, and indeed, cor.test() is
and hence the Kendall package is
serving an important need!
I do apologize for my impolite wording, to which I was led by
the example (and 'Subject').
If the topic is just *computation* of Kendall's tau, I don't
think anyone should use the Kendall package.
If, however, one is interested in P-values of (H0: tau = 0),
your Kendall package is indeed a valuable asset!
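
A minimal sketch of that distinction, using made-up data (not from
this thread) and base R only. It assumes nothing beyond cor() and
cor.test(); the vectors x and y are purely illustrative:

```r
# Illustrative data only; y contains ties, as is common in trend testing.
x <- 1:8
y <- c(1, 2, 2, 3, 5, 5, 6, 8)

# Point estimate of Kendall's tau: cor() alone is enough for this.
tau <- cor(x, y, method = "kendall")

# Test of H0: tau = 0. Because y has ties, cor.test() cannot compute an
# exact p-value; it issues a warning and falls back to a normal
# approximation.
ct <- cor.test(x, y, method = "kendall")

print(tau)
print(ct$p.value)
```

The estimate reported by cor.test() is the same number cor() returns;
the two differ only in whether a p-value is attached.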
AIM> Thank you though for your comments. So I will improve
AIM> the documentation for Kendall by terminating the
AIM> program with an error message when n<=3 (this case is
AIM> of no interest to me) and warning message when n<12
AIM> that the p-values may be inaccurate. My student Paul
AIM> Valz in his Ph.D. thesis discussed an enumeration
AIM> algorithm for the exact p-value computation for any n
AIM> with arbitrary ties in both variables -- but the
AIM> algorithm is complex and for practical purposes, I
AIM> prefer to use the algorithm in Kendall -- especially
AIM> for trend testing with block bootstrap. That is the
AIM> reason for the existence of this package.
AIM> Valz's algorithm was published in JCGS but I believe
AIM> there is a mistake, so I don't use it. The approximate
AIM> algorithm, for p-values, that is used in Kendall, has
AIM> been extensively tested.
AIM> Also, I doubt if the current p-values from cor.test are
AIM> correct for small n and I notice that ties in one
AIM> ranking do produce a warning.
That's an interesting point about which I think we should
exchange more, but really in a different thread, possibly on
R-devel rather than R-help.
Thanking you and apologizing once more:
Martin Maechler, ETH Zurich
AIM> Finally, I will also make more clear in the
AIM> documentation about cor and cor.test being alternative
AIM> functions which may be more appropriate for some users.
AIM> Ian McLeod
>> On Sat, Nov 22, 2008 at 9:04 AM, Martin Maechler
>> <maechler at stat.math.ethz.ch> wrote:
SM> I believe Kendall tau is well-defined for this case...
>>> The real question is *WHY* there needs to be a separate
>>> package 'Kendall' when R itself does everything you want
>>> and does not show any problems?
>> Thanks for pointing me to cor(...,method="kendall"),
>> which I did not know about; I used the Kendall CRAN
>> package out of pure ignorance.
>> In my defense, I think it is excusable ignorance, as
>> Search on the R Project home page finds the Kendall
>> package (which only mentions cor as a "See Also"). I
>> only more recently discovered the advantages of
>> By the way, is Kendall well-defined when the arguments
>> are not permutations of each other? cor seems to return
>> results even in this case:
>> a<-factor(c("Alice","Bob","Chris"))
>> b<-a[1:2]
>> c<-a[2:3]
>> cor(b,c,method="kendall") => 1
>> apparently interpreting b as c(1,2) and c as c(1,2) based
>> on alphabetical order (even though it is an UNordered
>> factor), which seems to make the value depend on the
>> subjects' names, which I'd think was wrong for a
>> rank-order statistic.
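
The dependence on names can be made concrete with a small sketch. It
assumes the factor is converted explicitly with as.integer(), since
recent versions of R refuse non-numeric input to cor(); the names and
the vector `score` are hypothetical:

```r
# Two labelings of the same three subjects; only the first name differs.
f1 <- factor(c("Alice", "Bob", "Chris"))
f2 <- factor(c("Zoe",   "Bob", "Chris"))

# Hypothetical second variable measured on the same subjects.
score <- c(1, 2, 3)

# Unordered factor levels are sorted alphabetically, so the integer
# codes are 1,2,3 for f1 but 3,1,2 for f2.
codes1 <- as.integer(f1)
codes2 <- as.integer(f2)

tau1 <- cor(codes1, score, method = "kendall")
tau2 <- cor(codes2, score, method = "kendall")

print(c(tau1, tau2))
```

Renaming a single subject changes the result, which is why integer
codes of an unordered factor are not meaningful ranks.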
>> Thanks again,