[Rd] Re: [R] p-value > 1 in fisher.test()

Sat Jun 4 19:17:47 CEST 2005

>>>>> "UweL" == Uwe Ligges <ligges at statistik.uni-dortmund.de>
>>>>>     on Sat, 04 Jun 2005 11:43:34 +0200 writes:

    UweL> (Ted Harding) wrote:
    >> On 03-Jun-05 Ted Harding wrote:
    >> 
    >>> And on mine
    >>> 
    >>> (A: PII, Red Hat 9, R-1.8.0):
    >>> 
    >>> ff <- c(0,10,250,5000); dim(ff) <- c(2,2);
    >>> 
    >>> 1-fisher.test(ff)$p.value
    >>> [1] 1.268219e-11
    >>> 
    >>> (B: PIII, SuSE 7.2, R-2.1.0beta):
    >>> 
    >>> ff <- c(0,10,250,5000); dim(ff) <- c(2,2);
    >>> 
    >>> 1-fisher.test(ff)$p.value
    >>> [1] -1.384892e-12
    >> 
    >> 
    >> I have a suggestion (maybe it should also go to R-devel).
    >> 
    >> There are many functions in R whose designated purpose is
    >> to return the value of a probability (or a probability
    >> density). This designated purpose is in the mind of the
    >> person who has coded the function, and is implicit in its
    >> usage.
    >> 
    >> Therefore I suggest that every such function should have
    >> a built-in internal check that no probability should be
    >> less than 0 (and if the primary computation yields such
    >> a value then the function should set it exactly to zero),
    >> and should not exceed 1 (in which case the function should
    >> set it exactly to 1). [And, in view of recent exchanges,
    >> I would suggest exactly +0, not -0!]
    >> 
    >> Similar for any attempts to return a negative probability
    >> density; while of course a positive value can be allowed
    >> to be anything.
    >> 
    >> All probabilities would then be guaranteed to be "clean"
    >> and issues like the Fisher exact test above would no longer
    >> be even a tiny problem.
    >> 
    >> Implementing this in the possibly many cases where it is
    >> not already present is no doubt a long-term (and tedious)
    >> project.
    >> 
    >> Meanwhile, people who encounter problems due to its absence
    >> can carry out their own checks and adjustments!

    UweL> [moved to R-devel]

    UweL> Ted, my (naive?) objection:
    UweL> Many errors in the underlying code have been detected by a function 
    UweL> returning a nonsensical value, but if the probability is silently set to 
    UweL> 0 or 1 .......
    UweL> Hence I would agree to do so in special cases where it makes sense 
    UweL> because of numerical issues, but please not globally.

I agree very much with Uwe's point.

Further to fisher.test(): This whole thread is
re-hashing a pretty recent bug report on fisher.test()
{ "negative p-values from fisher's test (PR#7801)", April '05}
I think that only *because* of the obviously wrong P-values have
we found and confirmed that the refereed and published code
underlying fisher.test() is bogus.   Such knowledge would have
been harder to gain if the P-values had been cut into [0,1].
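
To make the trade-off concrete, here is a minimal sketch in R of the two
behaviours under discussion. Both helper names (clamp_silent,
clamp_checked) and the tolerance 1e-8 are assumptions made up for
illustration; neither function is part of R or of fisher.test(). The
first input value mimics the tiny excess above 1 reported for system B,
the second is an invented, clearly wrong value.

```r
# Hypothetical helpers for illustration only; not part of R.

# Silent clamp, as proposed: any value is forced into [0, 1].
clamp_silent <- function(p) pmin(pmax(p, 0), 1)

# Checked clamp: accept rounding-sized excursions, but warn about
# anything larger, so that a genuine bug is not hidden.
# 'tol' is an arbitrary illustrative tolerance.
clamp_checked <- function(p, tol = 1e-8) {
  bad <- !is.na(p) & (p < -tol | p > 1 + tol)
  if (any(bad))
    warning("probability outside [0, 1] by more than 'tol': ",
            paste(format(p[bad]), collapse = ", "))
  pmin(pmax(p, 0), 1)
}

p <- c(1 + 1.4e-12, -0.3)  # rounding-sized excess; clearly wrong value
clamp_silent(p)            # both mapped into [0, 1] without comment
clamp_checked(p)           # same values, but warns about -0.3
```

With the silent version the value -0.3 would come back as 0 and nobody
would be any the wiser; the checked version still returns a clean
probability but leaves a trace when the deviation is too large to be
rounding error.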

Martin Maechler

    UweL> Uwe Ligges

    >> Best wishes to all,
    >> Ted.
    >> 
    >> 
    >> --------------------------------------------------------------------
    >> E-Mail: (Ted Harding) <Ted.Harding at nessie.mcc.ac.uk>
    >> Fax-to-email: +44 (0)870 094 0861
    >> Date: 04-Jun-05                                       Time: 00:02:32
    >> ------------------------------ XFMail ------------------------------