[R] general question on binomial test / sign test

Thu Sep 2 20:01:55 CEST 2010

Just to add to Ted's addition to my response.  I think you are moving towards better understanding (and your misunderstandings are common), but to further clarify:

First, make sure that you understand that the probability of A given B, p(A|B), is different from the probability of B given A, p(B|A).  A simple example from probability class:  I have 3 coins in my pocket: one has 2 heads, one has 2 tails, and the other is fair (p(H)=p(T)=0.5).  I take one coin out at random (each has probability 1/3 of being chosen), flip it, and observe "Heads".  What is the probability that it is the 2 headed coin?

So here the probability of Heads given the coin is 2 headed is 1.0, but that does not mean that the probability that I chose the 2 headed coin is 1 given I saw heads.  By Bayes' rule it is (1/3 * 1) / (1/3 * 1 + 1/3 * 0 + 1/3 * 0.5) = 2/3.
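
The coin example can be checked numerically.  This is only a small sketch in base R; the variable names (prior, p_heads, posterior) are my own and just for illustration:

```r
# Prior: each of the three coins is equally likely to be drawn.
prior <- c(two_headed = 1/3, two_tailed = 1/3, fair = 1/3)

# Likelihood: probability of observing Heads with each coin.
p_heads <- c(two_headed = 1, two_tailed = 0, fair = 0.5)

# Bayes' rule: posterior is proportional to prior * likelihood.
posterior <- prior * p_heads / sum(prior * p_heads)

p_heads["two_headed"]    # p(Heads | 2 headed coin)
posterior["two_headed"]  # p(2 headed coin | Heads)
```

The two printed numbers differ (1 versus 2/3), which is the whole point: conditioning in one direction tells you little about the other direction without the prior.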

The same applies with testing: the p-value is the probability of seeing data at least as extreme as what was observed, given that the null hypothesis is TRUE.  Many people try to interpret it as the probability that the null hypothesis is true given the data, but that is not the case (not even for Bayesians unless they use a really weird prior).  I think that you are starting to understand this part: high p-values mean that the data are consistent with the null, so we cannot reject the null, but they do not prove the null.
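
To see why a p-value of 1 does not prove the null, here is a minimal sketch using the single-flip case from this thread.  The second null value of 0.9 is my own illustrative choice, not something from the earlier messages:

```r
# One flip, one success, two-sided test of H0: p = 0.5.
p_null_half <- binom.test(1, 1, p = 0.5)$p.value

# One flip, zero successes, same null: the other possible outcome.
p_null_half_fail <- binom.test(0, 1, p = 0.5)$p.value

# The same single success, tested against a very different null, H0: p = 0.9.
p_null_0.9 <- binom.test(1, 1, p = 0.9)$p.value

c(p_null_half, p_null_half_fail, p_null_0.9)
```

All three p-values should come out as 1.  The same single success is "perfectly consistent" with p = 0.5 and with p = 0.9, and those cannot both be true, so the p-value of 1 cannot be read as the probability that either null holds.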

A great article for help in understanding p-values better is:

     Murdoch, D, Tsai, Y, and Adcock, J (2008) _P-Values are Random
     Variables_. The American Statistician. (62) 242-245.

The article treats p-values as random variables.  There are a couple of functions in the TeachingDemos package that implement some of the simulations discussed in the article; I would suggest playing with run.Pvalue.norm.sim and run.Pvalue.binom.sim.  Notice that when the null hypothesis is true (and you have a big enough sample size for the binomial case) the p-values follow a uniform distribution: p-values near 1.0, 0.5, 0.1, and 0.001 are all equally likely.  As the difference between the null hypothesis and the truth increases you get more and more p-values close to 0.
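
If you do not have TeachingDemos installed, the same idea can be sketched in base R.  This is not the code from the package or the article, just a rough simulation under my own assumptions: a one-sample t-test of mean = 0, with sim_pvalues, true_mean, n, and n_sim being illustrative names and values that I made up:

```r
set.seed(42)  # arbitrary seed, only so the run is reproducible

# Simulate p-values from one-sample t-tests of H0: mean = 0.
# true_mean is the actual mean of the simulated normal data.
sim_pvalues <- function(true_mean, n = 25, n_sim = 5000) {
  replicate(n_sim, t.test(rnorm(n, mean = true_mean), mu = 0)$p.value)
}

p_null <- sim_pvalues(true_mean = 0)    # null hypothesis is true
p_alt  <- sim_pvalues(true_mean = 0.5)  # null hypothesis is false

# Under the null the p-values are uniform: roughly 5% fall below 0.05
# and roughly 50% fall below 0.5.
mean(p_null < 0.05)
mean(p_null < 0.5)

# When the null is false, far more of the p-values land near 0.
mean(p_alt < 0.05)
```

Calling hist(p_null) and hist(p_alt) afterwards shows the contrast directly: a roughly flat histogram in the first case and a pile-up near 0 in the second.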

The real tricky bit about hypothesis testing is that we compute a single p-value, a single observation from a distribution, and based on that try to decide if the process that produced that observation is a uniform distribution or something else (that may be close to a uniform or very different).

Hope this helps,

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at imail.org
801.408.8111

> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
> On Behalf Of Kay Cecil Cichini
> Sent: Thursday, September 02, 2010 6:40 AM
> To: ted.harding at manchester.ac.uk
> Cc: r-help at r-project.org
> Subject: Re: [R] general question on binomial test / sign test
> 
> 
> thanks a lot for the elaborations.
> 
> your explanations made clear to me that either
> binom.test(1,1,0.5,"two.sided") or binom.test(0,1,0.5) giving a
> p-value of 1 simply indicates i have absolutely no grounds to reject
> H0.
> 
> considering binom.test(0,1,0.5,alternative="greater") and
> binom.test(1,1,0.5,alternative="less") where i get a p-value of 1 and
> 0.5, respectively - am i right in stating that for the first estimate
> 0/1 i have no grounds at all for rejection of H0 and for the second
> estimate = 1/1 i have the same chance of being wrong in either rejecting
> or keeping H0?
> 
> many thanks,
> kay
> 
> 
> 
> Quoting Ted.Harding at manchester.ac.uk:
> 
> > You state: "in reverse the p-value of 1 says that i can be 100% sure
> > that the estimate of 0.5 is true". This is where your logic about
> > significance tests goes wrong.
> >
> > The general logic of a significance test is that a test statistic
> > (say T) is chosen such that large values represent a discrepancy
> > between possible data and the hypothesis under test. When you
> > have the data, T evaluates to a value (say t0). The null hypothesis
> > (NH) implies a distribution for the statistic T if the NH is true.
> >
> > Then the value of Prob(T >= t0 | NH) can be calculated. If this is
> > small, then the probability of obtaining data at least as discrepant
> > as the data you did obtain is small; if sufficiently small, then
> > the conjunction of NH and your data (as assessed by the statistic T)
> > is so unlikely that you can decide to not believe that it is
> > possible.
> > If you so decide, then you reject the NH because the data are so
> > discrepant that you can't believe it!
> >
> > This is on the same lines as the "reductio ad absurdum" in classical
> > logic: "An hypothesis A implies that an outcome B must occur. But I
> > have observed that B did not occur. Therefore A cannot be true."
> >
> > But it does not follow that, if you observe that B did occur
> > (which is *consistent* with A), then A must be true. A could be
> > false, yet B still occur -- the only basis on which occurrence
> > of B could *prove* that A must be true is when you have the prior
> > information that B will occur *if and only if* A is true. In the
> > reductio ad absurdum, and in the parallel logic of significance
> > testing, all you have is "B will occur *if* A is true". The "only if"
> > part is not there. So you cannot deduce that "A is true" from
> > the observation that "B occurred", since what you have to start with
> > allows B to occur if A is false (i.e. "B will occur *if* A is true"
> > says nothing about what may or may not happen if A is false).
> >
> > So, in your single toss of a coin, it is true that "I will observe
> > either 'succ' or 'fail' if the coin is fair". But (as in the above)
> > you cannot deduce that "the coin is fair" if you observe either
> > 'succ' or 'fail', since it is possible (indeed necessary) that you
> > obtain such an observation if the coin is not fair (even if the
> > coin is the same, either 'succ' or 'fail', on both sides, therefore
> > completely unfair). This is an attempt to expand Greg Snow's reply!
> >
> > Your 2-sided test takes the form T=1 if either outcome='succ' or
> > outcome='fail'. And that is the only possible value for T since
> > no other outcome is possible. Hence Prob(T==1) = 1 whether the coin
> > is fair or not. It is not possible for such data to discriminate
> > between a fair and an unfair coin.
> >
> > And, as explained above, a P-value of 1 cannot prove that the
> > null hypothesis is true. All that is possible with a significance
> > test is that a small P-value can be taken as evidence that the
> > NH is false.
> >
> > Hoping this helps!
> > Ted.
> >
> > On 02-Sep-10 07:41:17, Kay Cecil Cichini wrote:
> >> i test the null that the coin is fair (p(succ) = p(fail) = 0.5) with
> >> one trial and get a p-value of 1. actually i want to prove the
> >> alternative H that the estimate is different from 0.5, which certainly
> >> cannot be proven here. but in reverse the p-value of 1 says that i
> >> can be 100% sure that the estimate of 0.5 is true (??) - that's the
> >> point that astonishes me.
> >>
> >> thanks if anybody could clarify this for me,
> >> kay
> >>
> >> Quoting Greg Snow <Greg.Snow at imail.org>:
> >>
> >>> Try thinking this one through from first principles, you are
> >>> essentially saying that your null hypothesis is that you are
> >>> flipping a fair coin and you want to do a 2-tailed test.  You then
> >>> flip the coin exactly once, what do you expect to happen?  The
> >>> p-value of 1 just means that what you saw was perfectly consistent
> >>> with what is predicted to happen flipping a single time.
> >>>
> >>> Does that help?
> >>>
> >>> If not, please explain what you mean a little better.
> >>>
> >>>
> >>>
> >>>> -----Original Message-----
> >>>> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
> >>>> On Behalf Of Kay Cichini
> >>>> Sent: Wednesday, September 01, 2010 3:06 PM
> >>>> To: r-help at r-project.org
> >>>> Subject: [R] general question on binomial test / sign test
> >>>>
> >>>>
> >>>> hello,
> >>>>
> >>>> i did several binomial tests and noticed for one sparse dataset that
> >>>> binom.test(1,1,0.5) gives a p-value of 1 for the null, which i can't
> >>>> quite grasp. that would say that a prob of 1/2 has p-value of 0 ?? - i
> >>>> must be wrong but can't figure out the right interpretation..
> >>>>
> >>>> best,
> >>>> kay
> >>>>
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> -----
> >>>> ------------------------
> >>>> Kay Cichini
> >>>> Postgraduate student
> >>>> Institute of Botany
> >>>> Univ. of Innsbruck
> >>>> ------------------------
> >>>>
> >>>>
> >>>> ______________________________________________
> >>>> R-help at r-project.org mailing list
> >>>> https://stat.ethz.ch/mailman/listinfo/r-help
> >>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> >>>> and provide commented, minimal, self-contained, reproducible code.
> >>>
> >>>
> >>
> >
> > --------------------------------------------------------------------
> > E-Mail: (Ted Harding) <Ted.Harding at manchester.ac.uk>
> > Fax-to-email: +44 (0)870 094 0861
> > Date: 02-Sep-10                                       Time: 09:42:34
> > ------------------------------ XFMail ------------------------------
> >
> >
> 