[R] null distribution of binom.test p values

Fri Jan 27 16:25:20 CET 2012

Greg, Thomas,

thank you dot the detailed and lucid replies.  I understand now.  I am 
doing multiple tests and wanted to present an overview of results using 
pp plots, which were looking very underdispersed.  Now I understand why, 
I think I can generate a "correct" expected distribution which should at 
least reassure the biologists when they view them.

Many thanks, Chris.

On 26/01/12 20:03, Thomas Lumley wrote:
> On Fri, Jan 27, 2012 at 5:36 AM, Chris Wallace
> <chris.wallace at cimr.cam.ac.uk>  wrote:
>> Greg, thanks for the reply.
>>
>> Unfortunately, I remain unconvinced!
>>
>> I ran a longer simulation, 100,000 reps.  The size of the test is
>> consistently too small (see below) and the histogram shows increasing bars
>> even within the parts of the histogram with even bar spacing.  See
>> https://www-gene.cimr.cam.ac.uk/staff/wallace/hist.png
>>
>> y<-sapply(1:100000, function(i,n=100)
>>   binom.test(sum(rnorm(n)>0),n,p=0.5,alternative="two")$p.value)
>> mean(y<0.01)
>> # [1] 0.00584
>> mean(y<0.05)
>> # [1] 0.03431
>> mean(y<0.1)
>> # [1] 0.08646
>>
>> Can that really be due to the discreteness of the distribution?
>
> Yes.  All so-called exact tests tend to be conservative due to
> discreteness, and there's quite a lot of discreteness in the tails
>
> The problem is far worse for Fisher's exact test, and worse still for
> Fisher's other exact test (of Hardy-Weinberg equilibrium --
> http://www.genetics.org/content/180/3/1609.full).
>
> You don't need to rely on finite-sample simulations here: you can
> evaluate the level exactly.  Using binom.test() you find that the
> rejection regions are y<=39 and y>=61, so the level at nominal 0.05
> is:
>> pbinom(39,100,0.5)+pbinom(60,100,0.5,lower.tail=FALSE)
> [1] 0.0352002
> agreeing very well with your 0.03431
>
> At nominal 0.01 the exact level is
>> pbinom(36,100,0.5)+pbinom(63,100,0.5,lower.tail=FALSE)
> [1] 0.006637121
> and at 0.1 it is
>> pbinom(41,100,0.5)+pbinom(58,100,0.5,lower.tail=FALSE)
> [1] 0.08862608
>
> Your result at nominal 0.01 is a bit low, but I think that's bad luck.
>   When I ran your code I got 0.00659 for the estimated level at nominal
> 0.01, which matches the exact calculations very well
>
>
> Theoreticians sweep this under the carpet by inventing randomized
> tests, where you interpolate a random p-value between the upper and
> lower values from a discrete distribution.  It's a very elegant idea
> that I'm glad to say I haven't seen used in practice.
>
>       -thomas
>