[R] wilcox.test p-value = 0

Keo Ormsby keo.ormsby2 at gmail.com
Fri Sep 18 20:04:29 CEST 2009


Hello Thomas and Bryan,
Thanks for the correction; sorry, Murat, I was mistaken. Your answers 
actually solved a problem I was having when running many fisher.test() 
calls on nucleic acid sequences, where we end up with hundreds of 
thousands of p values, many of which are 0. Since we have to correct for 
multiple tests, even very small p values might end up not being 
significant. I had assumed the 0's were tied p values, but now I know I 
can use the numerator and the denominator to rank the 0's, even if I 
don't have the exact p value.
Best,
Keo.
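
P.S. For anyone with the same fisher.test() problem: one way to rank p 
values that underflow to 0 is to work on the log scale. This is only a 
minimal sketch of that idea, not something proposed in the thread. It 
assumes 2x2 tables and a one-sided (alternative = "greater") test, the 
helper name log_p_greater is made up for illustration, and the two tables 
are invented. It relies on phyper() with log.p = TRUE.

```r
# Log of the one-sided ("greater") Fisher exact p value for a 2x2 table.
# Hypothetical helper; assumes a 2x2 matrix of non-negative counts.
log_p_greater <- function(tab) {
  m <- sum(tab[, 1])   # first column total
  n <- sum(tab[, 2])   # second column total
  k <- sum(tab[1, ])   # first row total
  x <- tab[1, 1]
  # log P(X >= x) under the hypergeometric null
  phyper(x - 1, m, n, k, lower.tail = FALSE, log.p = TRUE)
}

# Two invented tables whose p values both underflow to 0 in double precision
t1 <- matrix(c(5000, 0, 0, 5000), nrow = 2)
t2 <- matrix(c(4000, 1000, 1000, 4000), nrow = 2)

lp <- c(t1 = log_p_greater(t1), t2 = log_p_greater(t2))
exp(lp)     # both are 0 on the ordinary scale
order(lp)   # the log p values are finite, so the tables can still be ranked
```

A Bonferroni correction can stay on the log scale as well, since 
log(p * n_tests) is just log(p) + log(n_tests).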

Marc Schwartz wrote:
> Once one gets past the issue of the p value being extremely small, 
> irrespective of the test being used, the OP has asked the question of 
> how to report it.
>
> Most communities will have standards for how to report p values, 
> covering things like how many significant digits and a minimum p value 
> threshold to report.
>
> For example, in medicine, it is common to report 'small' p values as 
> 'p < 0.001' or 'p < 0.0001'.
>
> Thus, below those numbers, the precision is largely irrelevant and one 
> need not report the actual p value.
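
A minimal sketch of that reporting convention in R. The helper name 
report_p, the 0.001 threshold, and the two significant digits are all 
placeholders chosen for illustration; the guidance for the relevant 
journal or field takes precedence.

```r
# Hypothetical helper: report small p values as "p < threshold"
# and everything else rounded to a few significant digits.
report_p <- function(p, threshold = 0.001, digits = 2) {
  ifelse(p < threshold,
         paste0("p < ", format(threshold, scientific = FALSE)),
         paste0("p = ", signif(p, digits)))
}

report_p(c(0.0432, 1e-12, 0))
# "p = 0.043" "p < 0.001" "p < 0.001"
```

A p value that is numerically 0 is handled the same way as any other 
value below the threshold, so it never has to be printed as "p = 0".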
>
> I just wanted to be sure that we don't lose sight of the forest for 
> the trees...  :-)
>
> The OP should consult a relevant guidance document or an experienced 
> author in the domain of interest.
>
> HTH,
>
> Marc Schwartz
>
>
> On Sep 16, 2009, at 9:54 AM, Bryan Keller wrote:
>
>> That's right, if the test is exact it is not possible to get a 
>> p-value of zero.  wilcox.test does not provide an exact p-value in 
>> the presence of ties, so if there are any ties in your data you are 
>> getting a normal approximation.  Incidentally, if there are any ties 
>> in your data set I would strongly recommend computing the *exact* 
>> p-value, because using the normal approximation on tied data sets will 
>> either inflate the type I error rate or reduce power, depending on how 
>> the ties are distributed.  Depending on the pattern of ties, this can 
>> result in gross under- or overestimation of the p-value.
>>
>> I guess this is all by way of saying that you should always compute 
>> the exact p-value if possible.
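
For small samples, an exact p-value with ties can be obtained in base R 
by brute force. The sketch below is only an illustration of the idea and 
is not the algorithm used by exactRankTests or by the Cheung and Klotz 
program: it enumerates every way of assigning the pooled midranks to the 
two groups, so it is feasible only when choose(nx + ny, nx) is small. The 
function name exact_ranksum_p and the data are made up, and the two-sided 
p-value is defined here as the proportion of assignments whose rank sum 
is at least as far from its null mean as the observed one.

```r
# Exact two-sided permutation p-value for the rank-sum statistic,
# using midranks so that ties are handled. Hypothetical helper.
exact_ranksum_p <- function(x, y) {
  r   <- rank(c(x, y))             # midranks for tied observations
  nx  <- length(x)
  obs <- sum(r[seq_len(nx)])       # observed rank sum of group x
  # rank sum of group x under every possible assignment
  all_sums <- combn(length(r), nx, function(idx) sum(r[idx]))
  mu <- nx * (length(r) + 1) / 2   # null mean of the rank sum
  # small tolerance guards against floating-point noise in comparisons
  mean(abs(all_sums - mu) >= abs(obs - mu) - 1e-9)
}

# Invented data containing ties
x <- c(1, 2, 2, 3, 5)
y <- c(2, 4, 5, 5, 6, 7)

exact_ranksum_p(x, y)                      # exact, by enumeration
wilcox.test(x, y, exact = FALSE)$p.value   # normal approximation
```

On untied data the enumeration should agree with the exact p-value that 
wilcox.test() reports by default for small samples.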
>>
>> The package exactRankTests uses the algorithm by Mehta, Patel, and 
>> Tsiatis (1984).  If your sample sizes are larger, there is a freely 
>> available .exe by Cheung and Klotz (1995) that computes exact p-values 
>> for sample sizes larger than 100 in each group!
>>
>> You can find it at http://pages.cs.wisc.edu/~klotz/
>>
>> Bryan
>>
>>> Hi Murat,
>>> I am not an expert in either statistics or R, but I can imagine that,
>>> since the default is exact=TRUE, it numerically computes the
>>> probability, and it may indeed be 0. If you use
>>> wilcox.test(x, y, exact=FALSE) it will give you a normal
>>> approximation, which will most likely be different from zero.
>>
>> No, the exact p-value can't be zero for a discrete distribution. The 
>> smallest possible value in this case would, I think, be 
>> 1/choose(length(x)+length(y),length(x)), or perhaps twice that.
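
The bound above can be checked numerically. This is a small sketch with 
made-up helper names; it assumes untied data and samples small enough 
that wilcox.test() uses its exact method. The one-sample (signed-rank) 
analogue, 1/2^n, is an addition here rather than something stated in the 
thread; it is included because the original question used wilcox.test(x).

```r
# Smallest attainable exact p-value for the two-sample rank-sum test:
# only one of the choose(nx + ny, nx) equally likely rank assignments
# is the most extreme one.
min_p_two_sample <- function(nx, ny, two_sided = TRUE) {
  p <- 1 / choose(nx + ny, nx)
  if (two_sided) min(1, 2 * p) else p
}

# One-sample (signed-rank) analogue: 2^n equally likely sign patterns.
min_p_signed_rank <- function(n, two_sided = TRUE) {
  p <- 1 / 2^n
  if (two_sided) min(1, 2 * p) else p
}

# Completely separated, untied data give the most extreme statistic
x <- 1:5
y <- 6:10
min_p_two_sample(length(x), length(y))   # 2/252
wilcox.test(x, y)$p.value                # same value

min_p_signed_rank(10)                    # 2/1024
wilcox.test(1:10)$p.value                # same value
```
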
>>
>> More generally, the approach used by format.pval() is to display 
>> very small p-values as <2e-16, where 2e-16 is (approximately) machine 
>> epsilon.  I wouldn't want to claim optimality for this choice, but it 
>> seems a reasonable way to represent "very small".
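
A short illustration of that display rule, using the base R function 
format.pval() and its eps argument. The p-values are invented, and the 
second cutoff is chosen only as an example.

```r
p <- c(0.04, 1e-5, 1e-20, 0)

.Machine$double.eps          # about 2.2e-16
format.pval(p)               # values below eps are shown as "<" eps
format.pval(p, eps = 1e-4)   # a coarser cutoff, for illustration only
```

With the default eps, only the last two values are replaced by the 
"less than" form; with eps = 1e-4 the second value is replaced as well.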
>>
>>     -thomas
>>
>>
>>> Hope this helps.
>>> Keo.
>>>
>>> Murat Tasan wrote:
>>>> hi, folks,
>>>>
>>>> how have you gone about reporting a p-value from a test when the
>>>> returned value from a test (in this case a rank-sum test) is
>>>> numerically equal to 0 according to the machine?
>>>>
>>>> the next lowest value greater than zero that is distinct from zero on
>>>> the machine is likely algorithm-dependent (the algorithm of the test
>>>> itself), but without knowing the explicit steps of the algorithm
>>>> implementation, it is difficult to provide any non-zero value.  i
>>>> initially thought to look at .Machine$double.xmin, but i'm not
>>>> comfortable with reporting p < .Machine$double.xmin, since without
>>>> knowing the specifics of the implementation, this may not be true!
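
A small sketch of why the floor depends on the algorithm. It shows two 
generic ways a tail probability can come out as exactly 0 in double 
precision; it says nothing about which route wilcox.test() takes 
internally, and pnorm() is used only as a convenient stand-in.

```r
.Machine$double.eps    # about 2.2e-16: relative spacing of doubles near 1
.Machine$double.xmin   # about 2.2e-308: smallest normalised positive double

# Route 1: a tail formed as 1 - (something near 1) cannot resolve below eps
1 - (1 - 1e-20) == 0   # TRUE

# Route 2: a tail computed directly reaches 0 only through underflow
pnorm(-40) == 0        # TRUE
pnorm(-30) > 0         # TRUE
```

So "p < .Machine$double.xmin" is only a safe statement for the second 
kind of computation; for the first kind the resolvable floor is closer to 
.Machine$double.eps.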
>>>>
>>>> to be clear, if i have data x, and i run the following line, the
>>>> returned value is TRUE.
>>>>
>>>> wilcox.test(x)$p.value == 0
>>>>
>>>> thanks for any help on this!
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



