[R] confidence intervals around p-values

(Ted Harding) Ted.Harding at manchester.ac.uk
Thu Sep 9 16:24:33 CEST 2010


On 09-Sep-10 13:21:07, Duncan Murdoch wrote:
>   On 09/09/2010 6:44 AM, Fernando Marmolejo Ramos wrote:
>> Dear all
>>
>> I wonder if anyone has heard of confidence intervals around
>> p-values...
> 
> That doesn't really make sense.  p-values are statistics, not 
> parameters. You would compute a confidence interval around a
> population mean because that's a parameter, but you wouldn't
> compute a confidence interval around the sample mean: you've
> observed it exactly.
> 
> Duncan Murdoch

Duncan has succinctly stated the essential point in the standard
interpretation. The P-value is calculated from the sample in
hand, a definite null hypothesis, and the distribution of the
test statistic given the null hypothesis, so (given all of these)
there is no scope for any other answer.

However, there are circumstances in which the notion of "confidence
interval for a P-value" makes some sense. One such might be the
Mann-Whitney test for identity of distribution of two samples
of continuous variables, where (because of discretisation of the
values when they were recorded) there are ties.

Then you know in theory that the "underlying values" are all
different, but because you don't know where these lie in the
discretisation intervals you don't know which way a tie may
split. So it would make sense to simulate by splitting ties
at random (e.g. distributing each recorded "1.5" uniformly over
the interval (1.5,1.6) or (1.45,1.55), according to whether the
values were truncated or rounded to one decimal place).

For each such simulated tie-broken sample, calculate the P-value.
You then get a distribution of exact P-values, calculated from
samples without ties, all of which are consistent with the
recorded data. The central 95% of this distribution could be
interpreted as a 95% confidence interval for the true P-value.

To bring this closer to on-topic, here is an example in R
(rounding to intervals of 0.2):

  set.seed(51324)
  X <- sort(2*round(0.5*rnorm(12),1))
  Y <- sort(2*round(0.5*rnorm(12)+0.25,1))
  rbind(X,Y)
#   [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12]
# X -1.8 -1.2 -0.8 -0.6  0.0    0  0.2  0.2  1.2   1.8     2   2.2
# Y -1.2 -0.4 -0.2  0.4  0.4    1  1.0  1.0  1.2   1.8     2   2.6
# So several cross-sample ties (-1.2, 1.2, 1.8, 2.0), as well as
# within-sample ties (0.0, 0.2, 0.4, 1.0) which don't matter
# (only ties between an X and a Y affect the statistic).
  wilcox.test(X,Y,alternative="less",exact=TRUE,correct=FALSE)
  # data:  X and Y   W = 54, p-value = 0.1488
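  # (With ties present, wilcox.test() warns that it cannot compute
  # an exact P-value and falls back to the normal approximation;
  # hence the tie-breaking simulation below.)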

  Ps <- numeric(1000)
  for(i in (1:1000)){
    # Jitter each value uniformly over its rounding interval
    # (width 0.2), breaking all ties at random; note that each
    # sample has 12 elements
    Xr <- (X-0.1) + 0.2*runif(12)
    Yr <- (Y-0.1) + 0.2*runif(12)
    Ps[i] <- wilcox.test(Xr,Yr,alternative="less",
             exact=TRUE,correct=FALSE)$p.value
  }
  hist(Ps)
  table(round(Ps,4))
  # 0.1328 0.1457 0.1593 0.1737 0.1888 
  #     81    267    336    226     90 
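
The central 95% interval suggested above can then be read straight
off the simulated P-values. A minimal sketch (using R's default
quantile type):

  quantile(Ps, c(0.025, 0.975))
  # Given the counts in the table above, both quantiles fall in
  # the extreme groups, i.e. the interval runs from 0.1328 to 0.1888.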

So this gives you a picture of the uncertainty in the P-value
(0.1488, calculated from the rounded data) relative to what it
really should have been (if calculated from unrounded data).
Since each possible "true" (tie-broken) sample can be viewed
as a hypothesis about unobserved "truth", it does make a certain
sense to view these results as a kind of confidence distribution
for the P-value you should have got. However, this is more of a
Bayesian argument, since the above calculation has assigned
equal prior probability to the tie-breaks!

One could also, I suppose, consider the question of what
distribution of P-values might arise if the/an alternative
hypothesis were true, and ask where the P-value that we
actually got lies within that distribution. But these are
murkier waters ...
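
For what it's worth, here is a hypothetical sketch of that as
well, assuming the alternative is the location shift of 0.5
built into the simulated data above (0.5*rnorm()+0.25, doubled),
with the same sample sizes and one-sided test:

  # Distribution of exact P-values over repeated unrounded
  # samples when the shift alternative is true
  Pa <- numeric(1000)
  for(i in (1:1000)){
    Xa <- rnorm(12)
    Ya <- rnorm(12) + 0.5
    Pa[i] <- wilcox.test(Xa,Ya,alternative="less",
             exact=TRUE,correct=FALSE)$p.value
  }
  hist(Pa)
  # Proportion of such P-values at least as large as the
  # one we observed (0.1488):
  mean(Pa >= 0.1488)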

Ted.
