[R] general question on binomial test / sign test

Greg Snow Greg.Snow at imail.org
Fri Sep 3 16:19:46 CEST 2010


Ted,

I agree that we are measuring discrepancies, and that large discrepancies correspond to p-values near 0 and small discrepancies to large p-values.  But interpreting discrepancies on a p-value scale leads more to confusion than to understanding.  If you are interested in the discrepancy, then focus on the meaningful discrepancy scale (confidence intervals are great in many of these cases).  I also agree that the correspondence between small p-values and large discrepancies is meaningful: it says that a large discrepancy is indicative of a real difference rather than just luck.

My point was more focused on the overinterpretation of differences between large p-values (remember this thread started with the original poster misinterpreting a p-value of 1).  Try this exercise:  Consider a sample of size 100 from a normal population with known standard deviation of 1.  The null hypothesis is that the true mean is 50.  What sample mean(s) will result in a p-value of 0.4?  A p-value of 0.9?  Is the difference between the two discrepancies worth getting excited about?  Compare the conclusions you would draw from the two confidence intervals with what might be concluded from the two p-values.
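
A minimal R sketch of the exercise, for anyone who wants to check their answer.  It assumes a two-sided z-test and 95% confidence intervals (neither is stated above), and the helper names mean_for_p and ci95 are made up for illustration:

```r
# Setup from the exercise: n = 100, known sd = 1, null mean = 50.
n     <- 100
sigma <- 1
mu0   <- 50
se    <- sigma / sqrt(n)   # standard error of the sample mean

# Sample mean (above mu0) that gives a stated two-sided p-value.
# Assumption: two-sided z-test; the mirror image below mu0 works equally.
mean_for_p <- function(p) mu0 + qnorm(1 - p / 2) * se

# 95% confidence interval for the true mean, given a sample mean.
ci95 <- function(xbar) xbar + c(-1, 1) * qnorm(0.975) * se

xbar_04 <- mean_for_p(0.4)
xbar_09 <- mean_for_p(0.9)

print(c(p0.4 = xbar_04, p0.9 = xbar_09))
print(rbind(p0.4 = ci95(xbar_04), p0.9 = ci95(xbar_09)))
```

Under those assumptions the two sample means come out at roughly 50.084 and 50.013, and the two intervals overlap almost entirely.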

The difference between a p-value of 0.01 and 0.1 is very meaningful (if using alpha=0.05 or close to it); the difference between a p-value of 0.4 and 0.9 is much less meaningful, even though the numerical difference is bigger.

Also for alpha=0.05, I don't think it is worth getting any more excited over a p-value of 0.000000001 than one of 0.0001, but people do.
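
The p-values mentioned above can be lined up in one small table.  This is only a sketch, assuming a two-sided z-test so that each p-value maps back to a |z| discrepancy; the variable names are arbitrary:

```r
alpha <- 0.05
p     <- c(1e-9, 1e-4, 0.01, 0.1, 0.4, 0.9)

# |z| that yields each p-value under a two-sided z-test (an assumption);
# lower.tail = FALSE keeps precision for the very small p-values.
abs_z  <- qnorm(p / 2, lower.tail = FALSE)
reject <- p < alpha

print(data.frame(p = p, abs_z = round(abs_z, 2), reject = reject))
```

Of the pairs compared above, only 0.01 versus 0.1 falls on opposite sides of alpha.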

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at imail.org
801.408.8111


> -----Original Message-----
> From: Ted Harding [mailto:Ted.Harding at manchester.ac.uk]
> Sent: Thursday, September 02, 2010 3:59 PM
> To: Greg Snow
> Cc: r-help at r-project.org; Kay Cecil Cichini
> Subject: Re: [R] general question on binomial test / sign test
> 
> On 02-Sep-10 18:01:55, Greg Snow wrote:
> > Just to add to Ted's addition to my response.  I think you are moving
> > towards better understanding (and your misunderstandings are common),
> > but to further clarify:
> > [Wise words about P(A|B), P(B|A), P-values, etc., snipped]
> >
> > The real tricky bit about hypothesis testing is that we compute a
> > single p-value, a single observation from a distribution, and based
> > on that try to decide if the process that produced that observation
> > is a uniform distribution or something else (that may be close to a
> > uniform or very different).
> 
> Indeed. And this is precisely why I began my original reply as follows:
> 
> >> Zitat von Ted.Harding at manchester.ac.uk:
> >>> [...]
> >>> The general logic of a significance test is that a test statistic
> >>> (say T) is chosen such that large values represent a discrepancy
> >>> between possible data and the hypothesis under test. When you
> >>> have the data, T evaluates to a value (say t0). The null hypothesis
> >>> (NH) implies a distribution for the statistic T if the NH is true.
> >>>
> >>> Then the value of Prob(T >= t0 | NH) can be calculated. If this is
> >>> small, then the probability of obtaining data at least as
> >>> discrepant as the data you did obtain is small; if sufficiently
> >>> small, then the conjunction of NH and your data (as assessed by
> >>> the statistic T) is so unlikely that you can decide to not believe
> >>> that it is possible.
> >>> If you so decide, then you reject the NH because the data are so
> >>> discrepant that you can't believe it!
> 
> The point is that the test statistic T represents *discrepancy*
> between data and NH in some sense. In what sense? That depends on
> what you are interested in finding out; and, whatever it is,
> there will be some T that represents it.
> 
> It might be whether two samples come from distributions with equal
> means, or not. Then you might use T = mean(Ysample) - mean(Xsample).
> Large values of |T| represent discrepancy (in either direction)
> between data and an NH that the true means are equal. Large values
> of T represent discrepancy in the positive direction, large values
> of -T discrepancy in the negative direction. Or it might be whether or
> not the two samples are drawn from populations with equal variances,
> when you might use T = var(Ysample)/var(Xsample). Or it might be
> whether the distribution from which X was sampled is symmetric,
> in which case you might use skewness(Xsample). Or you might be
> interested in whether the numbers falling into disjoint classes
> are consistent with hypothetical probabilities p1,...,pk of
> falling into these classes -- in which case you might use the
> chi-squared statistic T = sum(((ni - N*pi)^2)/(N*pi)). And so on.
> 
> Once you have decided on what "discrepant" means, and chosen a
> statistic T to represent discrepancy, then the NH implies a
> distribution for T and you can calculate
>   P-value = Prob(T >= t0 | NH)
> where t0 is the value of T calculated from the data.
> 
> *THEN* small P-value is in direct correspondence with large T,
> i.e. small P is equivalent to large discrepancy. And it is also
> the direct measure of how likely you were to get so large a
> discrepancy if the NH really was true.
> 
> Thus the P-values, calculated from the distribution of (T | NH),
> are ordered, not just numerically from small P to large, but also
> equivalently by discrepancy (from large discrepancy to small).
> 
> Thus the uniform distribution of P under the NH does not just
> mean that any value of P is as likely as any other, so "So what?
> Why prefer one P-value to another?"
> 
> We also have that different parts of the [0,1] P-scale have
> different *meanings* -- the parts near 0 are highly discrepant
> from NH, the parts near 1 are highly consistent with NH,
> *with respect to the meaning of "discrepancy" implied by the
> choice of test statistic T*.
> 
> So it helps to understand hypothesis testing if you keep in
> mind what the test statistic T *represents* in real terms.
> 
> Greg's point about "try to decide if the process that produced that
> observation is a uniform distribution or something else (that may
> be close to a uniform or very different)" is not in the first instance
> relevant to the direct interpretation of small P-value as large
> discrepancy -- that involves only the Null Hypothesis NH, under
> which the P-values have a uniform distribution.
> 
> Where it comes into its own is that an Alternative Hypothesis AH
> would correspond to some degree of discrepancy of a certain kind,
> and if T is well chosen then its distribution under AH will give
> large values of T greater probability than they would get under NH.
> Thus the AHs that are implied by a large value of a certain test
> statistic T are those AHs that give such values of T greater
> probability than they would get under NH. Thus we are now getting
> into the domain of the Power of the test to detect discrepancy.
> 
> Ted.
> 
> --------------------------------------------------------------------
> E-Mail: (Ted Harding) <Ted.Harding at manchester.ac.uk>
> Fax-to-email: +44 (0)870 094 0861
> Date: 02-Sep-10                                       Time: 22:59:23
> ------------------------------ XFMail ------------------------------


