[Rd] Inconsistencies in wilcox.test

Karolis Koncevičius k@ro||@@koncev|c|u@ @end|ng |rom gm@||@com
Mon Dec 9 22:43:36 CET 2019


So I tried adding Infinity support for all cases.
And it is (as could be expected) more complicated than I thought.

It is easy to add Inf support for the test. The problems start with conf.int=TRUE.

Currently confidence intervals are computed via `uniroot()` and, in the 
case of infinities, we are computationally looking for roots over 
infinite interval which results in an error. I suspect this is the 
reason Inf values were removed in the first place.

Just a note, I found a few more errors/inconsistencies when requesting 
confidence intervals with paired=TRUE (due to Infinities being left in).

Current error in Inf-Inf scenario:

wilcox.test(c(1,2,Inf), c(4,8,Inf), paired=TRUE, conf.int=TRUE)
   Error in if (ZEROES) x <- x[x != 0] :
     missing value where TRUE/FALSE needed



NaN confidence intervals:

wilcox.test(c(1:9,Inf), c(21:28,Inf,30), paired=TRUE, conf.int=TRUE)

   Wilcoxon signed rank test with continuity correction

   data:  c(1:9, Inf) and c(21:28, Inf, 30)
   V = 9.5, p-value = 0.0586
   alternative hypothesis: true location shift is not equal to 0
   0 percent confidence interval:
    NaN NaN
    sample estimates:
    midrange
         NaN


The easiest "fix" for consistency would be to simply remove Infinity 
support from the paired=TRUE case.

But going with the more desirable approach of adding Infinity support 
for non-paired cases - it is currently not clear to me what confidence 
intervals and pseudomedian should be. Specially when Infinities are on 
both sides.

Regards,
Karolis Koncevičius.


On 2019-12-07 23:18, Karolis Koncevičius wrote:
>Thank you for a fast response. Nice to see this mailing list being so 
>alive.
>
>Regarding Inf issue: I agree with your assessment that Inf should not 
>be removed. The code gave me an impression that Inf values were 
>intentionally removed (since is.finite() was used everywhere, except 
>for paired case). I will try to adjust my patch according to your 
>feedback.
>
>One more thing: it seems like you assumed that issues 2:4 are all 
>related to machine precision, which is not the case - only 2nd issue 
>is.
>Just wanted to draw this to your attention, in case you might have 
>some feedback and guidelines about issues 3 and 4 as well.
>
>
>
>On 2019-12-07 21:59, Martin Maechler wrote:
>>>>>>>Karolis Koncevičius
>>>>>>>    on Sat, 7 Dec 2019 20:55:36 +0200 writes:
>>
>>   > Hello,
>>   > Writing to share some things I've found about wilcox.test() that seem a
>>   > a bit inconsistent.
>>
>>   > 1. Inf values are not removed if paired=TRUE
>>
>>   > # returns different results (Inf is removed):
>>   > wilcox.test(c(1,2,3,4), c(0,9,8,7))
>>   > wilcox.test(c(1,2,3,4), c(0,9,8,Inf))
>>
>>   > # returns the same result (Inf is left as value with highest rank):
>>   > wilcox.test(c(1,2,3,4), c(0,9,8,7), paired=TRUE)
>>   > wilcox.test(c(1,2,3,4), c(0,9,8,Inf), paired=TRUE)
>>
>>Now which of the two cases do you consider correct ?
>>
>>IHMO, the 2nd one is correct: it is exactly one property of the
>>*robustness* of the wilcoxon test and very desirable that any
>>(positive) outlier is treated the same as just "the largest
>>value" and the test statistic (and hence the p-value) should not
>>change.
>>
>>So I think the first case is wrong, notably if modified, (such
>>that the last  y  is the overall maximal value (slightly larger sample):
>>
>>>wilcox.test(1:7, 1/8+ c(9:4, 12))
>>
>>	Wilcoxon rank sum test
>>
>>data:  1:7 and 1/8 + c(9:4, 12)
>>W = 6, p-value = 0.01748
>>alternative hypothesis: true location shift is not equal to 0
>>
>>>wilcox.test(1:7, 1/8+ c(9:4, 10000))
>>
>>	Wilcoxon rank sum test
>>
>>data:  1:7 and 1/8 + c(9:4, 10000)
>>W = 6, p-value = 0.01748
>>alternative hypothesis: true location shift is not equal to 0
>>
>>>wilcox.test(1:7, 1/8+ c(9:4, Inf))
>>
>>	Wilcoxon rank sum test
>>
>>data:  1:7 and 1/8 + c(9:4, Inf)
>>W = 6, p-value = 0.03497
>>alternative hypothesis: true location shift is not equal to 0
>>
>>The  Inf  case should definitely give the same as the  10'000 case.
>>That's exactly one property of a robust statistic.
>>
>>Thank you, Karolis, this is pretty embarrassing to only be detected now after 25+ years of R in use ...
>>
>>The correct fix starts with replacing the  is.finite()  by   !is.na() and keep the 'Inf' in the rank computations...
>>(but then probably also deal with the case of more than one Inf, notably the  Inf - Inf  "exception" which is not triggered by your example...)
>>
>>
>>---
>>
>>Ben addressed the "rounding" / numerical issues unavoidable for
>>the other problems.
>>
>>   > 2. tolerance issues with paired=TRUE.
>>
>>   > wilcox.test(c(4, 3, 2), c(3, 2, 1), paired=TRUE)
>>   > # ...
>>   > # Warning:  cannot compute exact p-value with ties
>>
>>   > wilcox.test(c(0.4,0.3,0.2), c(0.3,0.2,0.1), paired=TRUE)
>>   > # ...
>>   > # no warning
>>
>>   > 3. Always 'x observations are missing' when paired=TRUE
>>
>>   > wilcox.test(c(1,2), c(NA_integer_,NA_integer_), paired=TRUE)
>>   > # ...
>>   > # Error:  not enough (finite) 'x' observations
>>
>>   > 4. No indication if normal approximation was used:
>>
>>   > # different numbers, but same "method" name
>>   > wilcox.test(rnorm(10), exact=FALSE, correct=FALSE)
>>   > wilcox.test(rnorm(10), exact=TRUE, correct=FALSE)
>>
>>
>>   > From all of these I am pretty sure the 1st one is likely unintended,
>>   > so attaching a small patch to adjust it. Can also try patching others if
>>   > consensus is reached that the behavioiur has to be modified.
>>
>>   > Kind regards,
>>   > Karolis Koncevičius.
>>
>>   > ---
>>
>>   > Index: wilcox.test.R
>>   > ===================================================================
>>   > --- wilcox.test.R  (revision 77540)
>>   > +++ wilcox.test.R  (working copy)
>>   > @@ -42,7 +42,7 @@
>>   > if(paired) {
>>   > if(length(x) != length(y))
>>   > stop("'x' and 'y' must have the same length")
>>   > -            OK <- complete.cases(x, y)
>>   > +            OK <- is.finite(x) & is.finite(y)
>>   > x <- x[OK] - y[OK]
>>   > y <- NULL
>>   > }
>>
>>   > ______________________________________________
>>   > R-devel using r-project.org mailing list
>>   > https://stat.ethz.ch/mailman/listinfo/r-devel



More information about the R-devel mailing list