[R] Two-tailed exact binomial test with binom.test and sum(dbinom(...))

peter dalgaard pdalgd at gmail.com
Sun Dec 14 16:21:09 CET 2014

> On 14 Dec 2014, at 13:54 , Stefan Evert <stefanML at collocations.de> wrote:
>> (3) What is people's view on computing the two-tailed test like this,
>> which leads to an ns result unlike binom.test?
>> 2*sum(dbinom(51:235, 235, 1/6)) # 0.05308849
> This is a popular approximation (which I also use most of the time) because it's much less expensive (in computational terms) than computing an exact (likelihood-based) two-tailed p-value as binom.test() does.  This is particularly relevant if you want to compute confidence intervals for the true probability p based on a large sample, which takes ages with binom.test().

When I get grilled about this, I usually say that one really shouldn't use "two-tailed" and "exact" in the same sentence, because the tails have no unique definition when the null distribution is discrete and asymmetric. I don't agree that the version in binom.test is in any sense _the_ correct one, and we should probably offer the alternatives as options at some point.
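
A small sketch of the tail problem, assuming the observed count is 51 out of 235 (inferred from the 51:235 in the quoted code) and using an illustrative "mirror image" cutoff that is not part of any of the methods discussed here. It only shows that the two tails at equal distance from the null mean carry different probability, so any two-sided rule has to choose how to pair them:

```r
n  <- 235
p0 <- 1/6
x  <- 51                      # assumed observed count (from the quoted 51:235)

mu <- n * p0                  # null mean, about 39.17

# Upper tail: outcomes at least as large as the observed count
upper <- sum(dbinom(x:n, n, p0))

# Lower tail at the same distance below the mean (illustrative cutoff only)
mirror <- floor(2 * mu - x)
lower  <- sum(dbinom(0:mirror, n, p0))

print(c(upper = upper, mirror_cutoff = mirror, lower = lower))
```

Doubling the upper tail reproduces the quoted 2*sum(dbinom(51:235, 235, 1/6)); the lower tail is not equal to the upper one, which is why "double the one-sided p" and "add up both tails" need not agree.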

One point that is easily overlooked (guilty!) is that defining the p-value as the sum over less probable outcomes is _not_ a likelihood theory technique. The likelihood ratio should have as its denominator the maximum probability of the outcome when the parameter is allowed to vary away from the null value, which for a count x out of n is attained at p = x/n. It is not that hard to do the actual LRT:

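> LRT <- -2*log(dbinom(0:235, 235, 1/6)/dbinom(0:235, 235, (0:235)/235))
> LRT_obs <- LRT[52]  # statistic at the observed count, taken to be 51 (from the quoted 51:235)
> dist_null <- dbinom(0:235, 235, 1/6)
> sum(dist_null[LRT >= LRT_obs])
[1] 0.05373588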

I believe there are four reasonable contenders for the two sided p-value:

1) sum of probabilities of less or equally probable outcomes
2) sum of probabilities of outcomes with more extreme LRT
3) double minimum one-tailed p
4) tail-balancing: one-sided p plus the max opposite tail probability less than p
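
The four definitions above can be sketched side by side. This is one possible reading rather than a reference implementation: the function name two_sided_p, the small tolerances used to guard against floating-point ties, and the choice of "not exceeding p" (rather than strictly less than) in the tail-balancing rule are all assumptions made for illustration, as is the observed count of 51 taken from the quoted 51:235.

```r
two_sided_p <- function(x, n, p0) {
  k   <- 0:n
  d   <- dbinom(k, n, p0)          # null probabilities of all outcomes
  tol <- 1e-7                      # guard against floating-point ties

  # (1) outcomes no more probable than the observed one
  p_prob <- sum(d[d <= d[x + 1] * (1 + tol)])

  # (2) outcomes with an LRT statistic at least as large as the observed one
  lrt   <- -2 * (dbinom(k, n, p0, log = TRUE) - dbinom(k, n, k / n, log = TRUE))
  p_lrt <- sum(d[lrt >= lrt[x + 1] - 1e-9])

  # (3) twice the smaller one-sided p-value, capped at 1
  lower    <- pbinom(x, n, p0)                          # P(X <= x)
  upper    <- pbinom(x - 1, n, p0, lower.tail = FALSE)  # P(X >= x)
  p_double <- min(1, 2 * min(lower, upper))

  # (4) tail-balancing: one-sided p plus the largest opposite-tail
  #     probability that does not exceed it (assumed reading)
  if (upper <= lower) {
    p_one <- upper
    opp   <- cumsum(d)              # P(X <= k), k = 0..n
  } else {
    p_one <- lower
    opp   <- rev(cumsum(rev(d)))    # P(X >= k), k = 0..n
  }
  opp       <- opp[opp <= p_one * (1 + tol)]
  p_balance <- min(1, p_one + if (length(opp)) max(opp) else 0)

  c(prob = p_prob, lrt = p_lrt, double = p_double, balance = p_balance)
}

res <- two_sided_p(51, 235, 1/6)
print(res)
```

For this example, definition (1) should agree with binom.test(51, 235, 1/6)$p.value, (2) with the LRT value computed above, and (3) with the doubled upper tail from the original question.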

Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com
