[Rd] [Suggested patch] to fligner.test - constant values can produce significant results

Martin Maechler maechler at stat.math.ethz.ch
Tue Jun 25 10:11:53 CEST 2019

>>>>> Karolis K 
>>>>>     on Fri, 21 Jun 2019 18:00:36 +0300 writes:

    > In specific cases fligner.test() can produce a small p-value even when both
    > groups are constant (zero within-group variance).

    > Here is an illustration:

    > fligner.test(c(1,1,2,2), c("a","a","b","b"))
    > # p-value = NA

    > But:

    > fligner.test(c(1,1,1,2,2,2), c("a","a","a","b","b","b"))
    > # p-value < 2.2e-16

    > This can potentially become dangerous if people perform lots of parallel tests
    > of this type (e.g. when doing a test for each gene in genomic studies).

I agree; this is really misleading and dangerously wrong.

    > Submitted a proposed patch that should solve the issue by producing an
    > error "data are essentially constant" - which is the same error message
    > found in t.test() under similar conditions.

I agree much less with that remedy (and with the same approach in t.test()):
In many similar situations, it has been very fruitful to have R's
algorithms behave in a "generalized continuous" way.  I have defined
this as (something like)

 "The value at infinity should correspond to the limit going to infinity"

 (and also applying that to  1/0 == Inf which is the correct
  limit in the case where the "0" is known to be non-negative,
  as here for the variance/sd) 

In this case (H0: variances in groups are equal), I'd argue that
H0 should *not* be rejected, and that the "most correct" P-value
is 1.  After all, both groups have the same variance, 0.
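
As a minimal sketch of that alternative (not the proposed patch, and not
what R currently does), the degenerate case could return P = 1 instead of
an error.  The wrapper name fligner_p_continuous is hypothetical; it
reuses the median centering shown in the quoted diff and otherwise defers
to fligner.test():

```r
# Hypothetical wrapper, not part of the stats package.
fligner_p_continuous <- function(x, g) {
    g <- factor(g)
    # Center each observation by its group median, as fligner.test() does.
    centered <- x - tapply(x, g, median)[g]
    if (all(centered == 0)) {
        # Every group is constant: all group variances are 0, hence equal,
        # so H0 is not rejected.
        return(1)
    }
    stats::fligner.test(x, g)$p.value
}

fligner_p_continuous(c(1, 1, 1, 2, 2, 2), c("a", "a", "a", "b", "b", "b"))
```

For non-degenerate data the wrapper simply returns the p-value of
fligner.test() unchanged.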

In the t.test() case (H0: group means are equal; whether the variances
are assumed equal is optional, via var.equal = TRUE / FALSE):
When the two group variances are 0, there are 2 cases, and I
claim something like the following should happen by
"generalized continuity":

1) if the group means are equal  H0 is not rejected     (P = 1)
2) if the group means differ,    H0 is clearly rejected (P = 0)

{where for '1)' I could also agree on being undecided and returning P = NaN}
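
The two cases above can be sketched as follows.  The function name
t_p_continuous is hypothetical, and the exact comparisons with 0 are an
assumption that is only safe for data such as small integers; a real
implementation would presumably need a tolerance:

```r
# Hypothetical helper illustrating "generalized continuity" for the
# two-sample t test; not part of the stats package.
t_p_continuous <- function(x, y) {
    if (var(x) == 0 && var(y) == 0) {
        # Both groups are constant:
        #   equal means     -> H0 is not rejected     (P = 1)
        #   different means -> H0 is clearly rejected (P = 0)
        return(if (mean(x) == mean(y)) 1 else 0)
    }
    # Otherwise defer to the usual test.
    stats::t.test(x, y)$p.value
}

t_p_continuous(c(1, 1, 1), c(1, 1, 1))  # case 1)
t_p_continuous(c(1, 1, 1), c(2, 2, 2))  # case 2)
```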

Returning an error in this case, as t.test() has been doing,
seems a waste (loss of information) in my view.
But for now, let's not discuss t.test() but the "var tests"
(homogeneity of variances)

    > P.S. First time writing to this list. Read all the guides of posting, but
    > sorry in advance if I still missed any rules.

well, thank you, but your post is really "perfect" in all formal senses
(correct mailing list, reproducible example code, using plain
 text, being polite ;-), even proposing a patch via diff )
==> really very well done and a role model for others!

Thank you indeed for raising the issue and proposing a patch.
We should discuss here ... i.e. hear other opinions etc.

Note that there are ca. 5 different tests for homogeneity of variances
(fligner.test, bartlett.test, var.test, ansari.test, mood.test)
and the behavior of the other 4 tests should also be considered.
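
To make that comparison concrete, here is a small sketch that runs all
five tests on the constant-groups example from the original report and
collects the p-values.  Errors are mapped to NA and warnings are
suppressed; that is a choice made for this illustration only, and the
actual values are deliberately not listed here:

```r
x <- c(1, 1, 1, 2, 2, 2)
g <- factor(c("a", "a", "a", "b", "b", "b"))
xa <- x[g == "a"]
xb <- x[g == "b"]

# One closure per test, so each can fail independently.
tests <- list(
    fligner.test  = function() fligner.test(x, g),
    bartlett.test = function() bartlett.test(x, g),
    var.test      = function() var.test(xa, xb),
    ansari.test   = function() ansari.test(xa, xb),
    mood.test     = function() mood.test(xa, xb)
)

p_values <- vapply(tests, function(f) {
    # NA_real_ marks a test that stopped with an error.
    tryCatch(as.numeric(suppressWarnings(f())$p.value),
             error = function(e) NA_real_)
}, numeric(1))

print(p_values)
```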

    > svn.diff:

    > Index: src/library/stats/R/fligner.test.R
    > ===================================================================
    > --- src/library/stats/R/fligner.test.R  (revision 76710)
    > +++ src/library/stats/R/fligner.test.R  (working copy)
    > @@ -55,6 +55,8 @@

    > ## Careful. This assumes that g is a factor:
    > x <- x - tapply(x,g,median)[g]
    > +    if (all(x == 0))
    > +      stop("data are essentially constant")

    > a <- qnorm((1 + rank(abs(x)) / (n + 1)) / 2)
    > STATISTIC <- sum(tapply(a, g, "sum")^2 / tapply(a, g, "length"))

    > ---
    > Karolis Koncevičius


    > ______________________________________________
    > R-devel using r-project.org mailing list
    > https://stat.ethz.ch/mailman/listinfo/r-devel
