[R] qqplot for binomial distribution

Ashim Kapoor ashimkapoor at gmail.com
Thu May 11 12:00:19 CEST 2017


Dear All,

When I do (count1_vector is listed at the end of this message):

set.seed(123)

expected_distribution<-rbinom(1000,100,.05)

# Without jitter on the observed values (count1_vector)

qqplot(jitter(expected_distribution), count1_vector,
       xlab = "Expected distribution", ylab = "Observed values")
qqline(count1_vector,
       distribution = function(probs) { qbinom(probs, size = 100, prob = 0.05) },
       col = "red", lwd = 2)

I get a line through the middle. Is this satisfactory?
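
For reference, qqline() by default draws the straight line through the first and third quartiles: the sample quartiles of its first argument are paired with the same quantiles of the function passed as `distribution`. The sketch below recomputes that line by hand. Here `observed` is a simulated stand-in for count1_vector (an assumption for illustration, not the real data).

```r
# Sketch: recompute by hand the line that qqline() draws (datax = FALSE).
set.seed(123)
observed <- rbinom(1000, size = 100, prob = 0.05)  # hypothetical stand-in for count1_vector

probs <- c(0.25, 0.75)                              # qqline()'s default probs
y <- quantile(observed, probs, names = FALSE)       # sample quartiles (type 7, the default)
x <- qbinom(probs, size = 100, prob = 0.05)         # theoretical quartiles: 3 and 6

slope     <- diff(y) / diff(x)
intercept <- y[1] - slope * x[1]

cat("slope:", slope, " intercept:", intercept, "\n")
```

Calling abline(intercept, slope) on an existing Q-Q plot should then overlay the same line as qqline(observed, distribution = ...). Because both sets of quartiles are computed from integer counts, the line depends only on two points and can look acceptable even when the tails disagree.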

# With jitter on both axes
qqplot(jitter(expected_distribution), jitter(count1_vector),
       xlab = "Expected distribution", ylab = "Observed values")
qqline(jitter(count1_vector),
       distribution = function(probs) { qbinom(probs, size = 100, prob = 0.05) },
       col = "red", lwd = 2)

Now I can see that the line does not pass exactly through the middle. Can I
conclude that this is due to the way the data are discretized?
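
One thing worth checking in the jittered version: qqplot() and qqline() each call jitter(count1_vector) separately, so the points and the line are built from different random noise. The sketch below shows how the sample quartiles, which determine the line, move from one jitter() call to the next. Again `observed` is a simulated stand-in for count1_vector (an assumption, not the real data).

```r
# Sketch: jitter() adds fresh random noise on every call, so the quartiles
# that qqline() uses change slightly each time.
set.seed(123)
observed <- rbinom(1000, size = 100, prob = 0.05)  # hypothetical stand-in for count1_vector

probs  <- c(0.25, 0.75)
q_raw  <- quantile(observed, probs, names = FALSE)          # quartiles of the raw counts
q_jit1 <- quantile(jitter(observed), probs, names = FALSE)  # first jitter() call
q_jit2 <- quantile(jitter(observed), probs, names = FALSE)  # second, independent call

print(rbind(raw = q_raw, jitter_1 = q_jit1, jitter_2 = q_jit2))

# Size of the noise added by a default jitter() call on these counts
print(max(abs(jitter(observed) - observed)))
```

If the jittered look is wanted, one option is to jitter once, store the result (for example y_jit <- jitter(count1_vector), a hypothetical name), and pass the same vector to both qqplot() and qqline(). Passing the unjittered counts to qqline() keeps the line tied to the actual data.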

count1_vector <-
c(2, 6, 5, 8, 8, 6, 8, 3, 5, 8, 7, 6, 4, 7, 5, 2, 3, 3, 6, 3,
7, 3, 4, 8, 6, 6, 3, 5, 5, 3, 4, 5, 4, 2, 3, 6, 7, 7, 5, 4, 4,
7, 9, 4, 4, 4, 8, 9, 3, 5, 4, 7, 4, 3, 8, 6, 5, 5, 7, 3, 6, 7,
8, 7, 9, 3, 5, 5, 9, 8, 7, 7, 2, 3, 5, 2, 4, 14, 7, 7, 7, 3,
5, 4, 2, 12, 3, 6, 9, 4, 4, 3, 4, 4, 4, 6, 7, 4, 6, 10, 8, 5,
3, 3, 1, 3, 4, 3, 7, 3, 9, 7, 3, 3, 7, 5, 1, 2, 2, 3, 5, 4, 3,
8, 7, 0, 5, 3, 3, 4, 9, 2, 7, 5, 5, 5, 7, 7, 5, 4, 7, 2, 3, 5,
4, 5, 2, 10, 6, 3, 6, 2, 11, 2, 5, 5, 2, 5, 2, 10, 4, 5, 9, 1,
5, 6, 3, 4, 7, 7, 2, 2, 3, 6, 6, 6, 7, 3, 3, 6, 1, 4, 4, 4, 10,
4, 7, 4, 3, 4, 6, 5, 6, 7, 3, 7, 3, 5, 6, 6, 4, 5, 1, 7, 5, 7,
6, 7, 5, 3, 6, 7, 10, 5, 5, 4, 9, 6, 3, 9, 8, 4, 2, 8, 10, 4,
6, 7, 4, 4, 8, 4, 5, 4, 5, 6, 6, 5, 8, 2, 2, 6, 5, 3, 7, 7, 4,
9, 6, 5, 7, 8, 6, 1, 2, 3, 4, 4, 6, 8, 5, 8, 5, 7, 7, 4, 6, 3,
4, 5, 3, 4, 5, 3, 3, 4, 5, 5, 7, 8, 6, 5, 3, 3, 4, 3, 8, 6, 6,
3, 0, 4, 2, 7, 3, 5, 4, 6, 7, 4, 7, 6, 5, 8, 6, 7, 4, 5, 3, 6,
7, 6, 3, 5, 3, 3, 6, 3, 3, 2, 7, 5, 10, 2, 4, 5, 2, 4, 10, 5,
2, 7, 8, 5, 3, 7, 4, 2, 4, 3, 5, 6, 8, 10, 3, 7, 5, 8, 5, 2,
6, 8, 6, 7, 8, 2, 4, 2, 4, 3, 2, 4, 4, 2, 4, 3, 12, 2, 11, 5,
8, 8, 3, 6, 2, 6, 3, 5, 4, 8, 4, 5, 7, 2, 5, 3, 5, 3, 7, 6, 5,
2, 8, 6, 3, 3, 5, 3, 2, 6, 5, 8, 7, 4, 2, 3, 5, 2, 6, 4, 9, 5,
4, 4, 2, 1, 3, 3, 2, 5, 7, 6, 4, 4, 5, 6, 7, 4, 4, 4, 5, 9, 7,
5, 3, 5, 5, 2, 11, 9, 6, 8, 6, 6, 8, 6, 3, 6, 3, 7, 3, 3, 7,
4, 7, 4, 8, 3, 4, 8, 8, 8, 7, 4, 6, 1, 3, 7, 5, 13, 8, 1, 8,
5, 1, 3, 5, 4, 6, 5, 4, 3, 3, 7, 5, 5, 5, 3, 5, 5, 1, 8, 6, 4,
5, 9, 3, 8, 6, 4, 7, 6, 7, 5, 5, 6, 2, 7, 8, 11, 10, 4, 8, 5,
5, 4, 5, 4, 2, 8, 3, 3, 4, 5, 7, 12, 4, 7, 5, 4, 9, 8, 4, 5,
9, 4, 6, 5, 5, 2, 3, 4, 7, 7, 7, 7, 1, 6, 6, 6, 4, 8, 8, 5, 7,
3, 4, 6, 2, 6, 6, 4, 8, 4, 3, 7, 1, 4, 6, 2, 3, 5, 5, 9, 7, 1,
4, 1, 5, 3, 5, 4, 3, 5, 10, 4, 8, 6, 4, 3, 5, 6, 4, 6, 6, 4,
7, 7, 6, 5, 4, 4, 6, 10, 6, 5, 3, 8, 3, 4, 3, 5, 5, 2, 6, 6,
8, 2, 5, 9, 6, 5, 5, 4, 10, 7, 3, 5, 6, 8, 5, 3, 3, 7, 3, 4,
6, 2, 9, 2, 6, 5, 3, 6, 2, 4, 3, 4, 5, 5, 1, 5, 4, 11, 4, 1,
9, 5, 4, 7, 2, 11, 4, 9, 6, 5, 5, 6, 6, 7, 9, 4, 4, 4, 4, 3,
7, 3, 3, 4, 2, 6, 6, 6, 4, 6, 2, 5, 6, 5, 4, 3, 4, 7, 8, 7, 3,
5, 4, 4, 4, 4, 4, 4, 2, 7, 3, 5, 7, 1, 5, 5, 2, 7, 3, 3, 5, 3,
5, 4, 9, 5, 7, 8, 7, 7, 4, 5, 3, 5, 6, 5, 1, 6, 5, 5, 8, 7, 3,
6, 8, 1, 12, 1, 7, 6, 6, 3, 4, 4, 2, 2, 3, 2, 8, 4, 3, 7, 9,
10, 5, 5, 6, 7, 3, 7, 4, 7, 7, 3, 5, 9, 7, 3, 6, 6, 2, 5, 4,
3, 5, 8, 5, 6, 3, 4, 5, 2, 4, 3, 4, 5, 2, 7, 2, 7, 5, 5, 6, 8,
4, 8, 6, 5, 4, 5, 1, 6, 6, 2, 4, 8, 5, 7, 6, 10, 6, 4, 4, 4,
9, 5, 3, 1, 10, 7, 5, 6, 4, 7, 5, 6, 4, 2, 4, 6, 5, 3, 3, 6,
5, 9, 3, 7, 9, 4, 1, 4, 2, 4, 5, 4, 4, 2, 7, 11, 3, 3, 5, 8,
3, 5, 7, 9, 6, 11, 6, 5, 3, 7, 5, 3, 7, 4, 5, 4, 4, 8, 3, 3,
5, 4, 3, 7, 4, 2, 10, 2, 4, 3, 8, 4, 4, 5, 3, 3, 6, 2, 7, 2,
2, 11, 1, 6, 3, 6, 5, 7, 3, 3, 1, 7, 9, 8, 7, 2, 5, 4, 3, 7,
7, 2, 5, 4, 3, 3, 6, 10, 4, 9, 6, 5, 3, 4, 5, 5, 6, 6, 7, 3,
4, 8, 6, 4, 5, 1, 5, 9, 3, 6, 2, 4, 5, 5, 3, 3, 3, 3, 5, 4, 4,
5, 5, 1, 4, 5, 8, 7, 4, 3, 3, 5, 5, 4, 6, 5, 4, 7, 4, 4, 3, 3,
8, 4, 6, 7, 3, 4, 3, 5, 5, 7, 3, 6, 9, 7, 4, 3, 2, 6)
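
A possible variation, following the qchisq() pattern in the ?qqplot examples: put exact theoretical quantiles on the x-axis instead of a random rbinom() sample, so that only the observed data carry sampling noise. This is a sketch under assumptions: `observed` is a simulated stand-in for count1_vector, and `theoretical` and `qq` are names introduced here for illustration.

```r
# Sketch: Q-Q plot against exact binomial quantiles rather than a random sample.
set.seed(123)
observed <- rbinom(1000, size = 100, prob = 0.05)  # hypothetical stand-in for count1_vector

# Theoretical quantiles at evenly spaced probability points
theoretical <- qbinom(ppoints(length(observed)), size = 100, prob = 0.05)

qq <- qqplot(theoretical, observed,
             xlab = "Theoretical binomial quantiles",
             ylab = "Observed values")

# Line through the quartiles, with the binomial as reference distribution
qqline(observed,
       distribution = function(p) qbinom(p, size = 100, prob = 0.05),
       col = "red", lwd = 2)

# Identity line for comparison: points near it indicate agreement
abline(0, 1, lty = 2)
```

With discrete data many points still coincide, so the plot shows a staircase of overlapping points rather than a smooth curve.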

On Wed, Apr 19, 2017 at 12:32 PM, Ashim Kapoor <ashimkapoor at gmail.com>
wrote:

> Dear Boris,
>
> Many thanks,
> Ashim
>
> On Tue, Apr 18, 2017 at 7:56 PM, Boris Steipe <boris.steipe at utoronto.ca>
> wrote:
>
>> As per the help pages, the data samples are expected in the second
>> argument, "y".
>>
>> So try
>>   qqplot(rbinom(n=100, size=100, p=0.05), count1_vector)
>>
>> ... and then plot your qqline()
>>
>> Alternatively, try
>>
>> qqline(count1_vector,
>>        distribution = function(probs) { qbinom(probs, size=100, prob=0.05) },
>>        datax = TRUE, # <- logical. Should data values be on the x-axis?
>>        col = "red",
>>        lwd = 0.5)
>> ... and use your original qqplot()
>>
>>
>> B.
>>
>>
>> > On Apr 18, 2017, at 12:47 AM, Ashim Kapoor <ashimkapoor at gmail.com> wrote:
>> >
>> > Dear Boris,
>> >
>> > Thank you for your reply.
>> >
>> > > dput(count1_vector)
>> > c(5, 6, 4, 4, 6, 5, 4, 5, 3, 7, 5, 5, 3, 4, 8, 6, 10, 2, 4, 6,
>> > 8, 4, 4, 6, 8, 5, 6, 3, 7, 9, 4, 7, 5, 7, 3, 4, 5, 2, 11, 7,
>> > 8, 5, 5, 6, 3, 2, 3, 5, 9, 6, 5, 6, 7, 3, 10, 7, 6, 4, 9, 5,
>> > 7, 3, 7, 3, 2, 3, 4, 5, 10, 4, 5, 5, 6, 7, 4, 8, 7, 5, 5, 4,
>> > 8, 7, 9, 4, 4, 4, 7, 5, 4, 10, 4, 5, 6, 1, 3, 5, 4, 7, 4, 6)
>> >
>> > set.seed(123)
>> > qqplot(count1_vector,rbinom(n=100,size=100,p=.05))
>> > qqline(count1_vector,
>> >        distribution = function(probs) { qbinom(probs, size=100, prob=0.05) },
>> >        col = "red",
>> >        lwd = 0.5)
>> >
>> > When I do this, the line does not pass through the center of my data. I
>> > do expect count1_vector to be 100 samples from a binomial with n=100 and p=.05.
>> >
>> > Any comments or suggestions for me?
>> >
>> > Note: I built a 95% confidence interval for my data and counted how
>> > often, out of 100 times, the data fell outside the CI. I expect this count
>> > to be binomial with n=100, p=.05. I repeated this 100 times and obtained
>> > count1_vector.
>> >
>> > Best Regards,
>> > Ashim.
>> >
>> >
>> > On Mon, Apr 17, 2017 at 7:51 PM, Boris Steipe <boris.steipe at utoronto.ca> wrote:
>> > That's not how qqline() works. The line is drawn with respect to a
>> > _reference_distribution_ which is the normal distribution by default. For
>> > the binomial distribution, you need to specify the distribution argument.
>> > There is an example in the help page that shows you how this is done for
>> > qchisq(). For qbinom() it is:
>> >
>> >
>> > set.seed(123)
>> > qqplot(rbinom(n=100, size=100, p=0.05),
>> >        rbinom(n=100, size=100, p=0.05) )
>> >
>> > qqline(rbinom(n=100,size=100,p=.05),
>> >        distribution = function(probs) { qbinom(probs, size=100, prob=0.05) },
>> >        col = "red",
>> >        lwd = 0.5)
>> >
>> >
>> >
>> >
>> > B.
>> >
>> >
>> > > On Apr 17, 2017, at 9:15 AM, Ashim Kapoor <ashimkapoor at gmail.com> wrote:
>> > >
>> > > Dear Spencer,
>> > >
>> > > Okay. Many thanks. My next query is how do I use qqline?
>> > >
>> > > When I try
>> > >
>> > >> qqline(rbinom(n=100,size=100,p=.05))
>> > >
>> > > I don't get the line in the right place.
>> > >
>> > > Best Regards,
>> > > Ashim
>> > >
>> > > On Mon, Apr 17, 2017 at 6:31 PM, Spencer Graves <
>> > > spencer.graves at effectivedefense.org> wrote:
>> > >
>> > >>
>> > >>
>> > >> On 2017-04-17 7:58 AM, Ashim Kapoor wrote:
>> > >>
>> > >>> Dear All,
>> > >>>
>> > >>> set.seed(123)
>> > >>> qqplot(rbinom(n=100,size=100,p=.05), rbinom(n=100,size=100,p=.05) )
>> > >>>
>> > >>> I expect to see 1 clear line, but I don't. What am I misunderstanding?
>> > >>>
>> > >>
>> > >>
>> > >>      The distribution is discrete, and points are superimposed. Try the
>> > >> following:
>> > >>
>> > >>
>> > >> set.seed(123)
>> > >> qqplot(jitter(rbinom(n=100,size=100,p=.05)),
>> > >>       jitter(rbinom(n=100,size=100,p=.05) ))
>> > >>
>> > >>
>> > >>      Spencer Graves
>> > >>
>> > >>>
>> > >>> Best Regards,
>> > >>> Ashim
>> > >>>
>> > >>>        [[alternative HTML version deleted]]
>> > >>>
>> > >>> ______________________________________________
>> > >>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> > >>> https://stat.ethz.ch/mailman/listinfo/r-help
>> > >>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> > >>> and provide commented, minimal, self-contained, reproducible code.
>> > >>>
>> > >>
>> >
>> >
>>
>>
>
