[R] Small p from binomial probability function.

Thu Oct 10 16:56:37 CEST 2013

Hi,

Thank you for your answers, I'm not completely sure if it's to bino.test I need or the uniroot. Perhaps I should explain more the idea behind the code and the actual task I'm trying to do. The idea is to calculate a confidence interval as to the age of two DNA sequences which have diverged, where I know the number of mutations that happened in them, and I know the mutation rate.

The binomial probability can be used since, mutations have a probability of occurring or being observed so many times in a sequence. This is dependent on the length of the DNA stretch (which equates to the number of trials since each base is a possibility of observing a mutation), the probability of a single mutation occurring which is p = t * u, since more time means a higher probability a mutation may have occurred.  

So my code, using pbinom, is supposed to calculate the probability that my DNA stretches contain the number of mutations observed P(X = k), given their size (trials) and the probability of a single mutation (p = t * u). However I'm interested in finding t: t is what is unknown, so the loop repeatedly evaluates the calculation, increasing t each time and checking P(X=k), when it is 0.05, 0.50 and 0.95, we record t.

Ideally I'd like to rearrange this so I can get the probability of a single success (mutation) p, and then divide by the mutation rate to get my t. My supervisor gave my the loopy code but I imagine there is a way to plug in P(X=k) as 0.05 and 0.95 and get my upper and lower t estimates.

According to the R built in docs:

binom.test
Description:

     Performs an exact test of a simple null hypothesis about the
     probability of success in a Bernoulli experiment.

Perhaps this is the one I need rather than uniroot?

Best,
Ben.

________________________________________
From: Stefan Evert [stefanML at collocations.de]
Sent: 10 October 2013 09:37
To: R-help Mailing List
Cc: Benjamin Ward (ENV)
Subject: Re: [R] Small p from binomial probability function.

Sounds like you want a 95% binomial confidence interval:

        binom.test(N, P)

will compute this for you, and you can get the bounds directly with

        binom.test(N, P)$conf.int

Actually, binom.test computes a two-sided confidence interval, which corresponds roughly to 2.5 and 97.5 percentages in your approach. It doesn't give you the 50% point either, but I don't think that's a meaningful quantity with a two-sided test.

Hope this helps,
Stefan

On 9 Oct 2013, at 15:53, Benjamin Ward (ENV) <B.Ward at uea.ac.uk> wrote:

> I got given some code that uses the R function pbionom:
>
> p <- mut * t
> sumprobs <- pbinom( N, B, p ) * 1000
>
> Which gives the output of a probability as a percentage like 5, 50, 95.
>
> What the code currently does is find me the values of t I need, by using the above two code lines in a loop, each iteration it increaces t by one and runs the two lines. When sumprobs equals 5, it records the value t, then again when sumprobs is equal to 50, and again when sumprobs is equal to 95 - giving me three t values. This is not an efficient way of doing this if t is large. Is it possible to rearrange pbinom so it gives me the small p (made of mut*t) as the result of plugging in the sumprobs instead, and is there an R function that already does this?
>
> Since pbinom is the binomial probability equation I suppose the question is - in more mathematical terminology - can I change this code so that instead of calculating the Probability of N successes given the number of trials and the probability of a single success, can I instead calculate the probability of a single success using the probability of N successes and number of trials, and the number of successes? Can R do this for me. So instead I plug in 5, 50, and 95, and then get the small p out?