[R] poisson fit for histogram

Ben Bolker bolker at ufl.edu
Thu Jul 21 00:48:31 CEST 2005

Thomas Isenbarger <isen <at> plantpath.wisc.edu> writes:

> I haven't been an R lister for a bit, but I hope to enlist someone's  
> help here.  I think this is a simple question, so I hope the answer  
> is not much trouble.  Can you please respond directly to this email  
> address in addition to the list (if responding to the list is  
> warranted)?
> I have a histogram and I want to see if the data fit a Poisson  
> distribution.  How do I do this?  It is preferable if it could be  
> done without having to install any or many packages.
> I use R Version 1.12 (1622) on OS X
> Thank-you very much,
> Tom Isenbarger
> --
> Tom Isenbarger PhD
> isen <at> plantpath.wisc.edu
> 608.265.0850

  by "histogram" do you mean that you have counts for each
non-negative integer?  or are the data more binned than that?
(I'll assume the former since it's slightly easier to deal with.)
Say you have vectors "number" and "count".  The mean of
the sample is meanval <- sum(number*count)/sum(count).  The _expected_
number of counts in each bin if the distribution is
Poisson is expval <- sum(count)*dpois(number,meanval).
The chi-square statistic is csq <- sum((expval-count)^2/expval), with
df <- length(count)-2 degrees of freedom (one lost for the estimated
mean and one for the fixed total number of observations).
pchisq(csq,df=df,lower.tail=FALSE) should give you the p-value of
the chi-squared goodness-of-fit test.
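
  Put together, the steps above might look like the sketch below.
The "number" and "count" values are made up purely for illustration
(they are not the poster's data), and the mean is the count-weighted
mean, i.e. divided by the total number of observations.

```r
# Hypothetical histogram: 'number' holds the observed integer values,
# 'count' holds how many times each value was seen (invented data).
number <- 0:6
count  <- c(27, 54, 55, 36, 18, 7, 3)

# Count-weighted sample mean, used as the Poisson rate estimate
meanval <- sum(number * count) / sum(count)

# Expected count in each bin under a Poisson with that mean
expval <- sum(count) * dpois(number, meanval)

# Pearson chi-square statistic and its degrees of freedom
csq <- sum((expval - count)^2 / expval)
df  <- length(count) - 2

# Upper-tail probability: small values suggest a poor Poisson fit
pval <- pchisq(csq, df = df, lower.tail = FALSE)
print(c(mean = meanval, chisq = csq, df = df, p = pval))
```

Note that the expected counts here do not quite add up to sum(count),
because the Poisson probability above the largest observed value is
left out; lumping the upper tail into one bin takes care of that.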

  A couple of minor issues: (1) beware, this is shooting from
the hip -- haven't tested at all; (2) you may have to deal
with lumping categories (rule of thumb is that expected number
of counts in a bin should not be < 5).
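
  One possible way to do the lumping is sketched below, again with
invented data.  It assumes that only the upper tail is sparse, and it
merges the top categories into a single "this value or more" bin whose
expected count is at least 5.  The names tail_exp, cut, obs and expd
are just illustrative.

```r
# Hypothetical histogram (invented data, same layout as before)
number <- 0:6
count  <- c(27, 54, 55, 36, 18, 7, 3)
n       <- sum(count)
meanval <- sum(number * count) / n
expval  <- n * dpois(number, meanval)

# Expected count for the open-ended bin "X >= number[i]":
# ppois(q, lower.tail = FALSE) is P(X > q), so use q = number - 1
tail_exp <- n * ppois(number - 1, meanval, lower.tail = FALSE)

# Largest starting point at which the open-ended bin still expects >= 5
cut <- max(which(tail_exp >= 5))

# Keep the bins below 'cut' as they are, lump everything from 'cut' upward
obs  <- c(count[seq_len(cut - 1)],  sum(count[cut:length(count)]))
expd <- c(expval[seq_len(cut - 1)], tail_exp[cut])

csq  <- sum((expd - obs)^2 / expd)
df   <- length(obs) - 2
pval <- pchisq(csq, df = df, lower.tail = FALSE)
```

Because the last bin now includes the whole upper tail, the expected
counts add up to the total number of observations.  If the low end is
sparse too, the same idea would have to be applied there as well.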

  Other ways to tackle this: use fitdistr() from MASS with
different candidate distributions (Poisson, neg. bin.) and
do a likelihood ratio test or compare AICs; compare variance and
mean of the data (for a Poisson they should be roughly equal; much
cruder).
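
  A rough sketch of these alternatives follows.  It uses simulated,
deliberately overdispersed data (an assumption made only so there is
something to fit); with a "number"/"count" histogram you could rebuild
the raw observations with rep(number, count).  MASS ships with the
standard R distribution, so nothing extra should need installing.

```r
library(MASS)

# Simulated counts, overdispersed on purpose (illustrative only)
set.seed(42)
x <- rnbinom(200, size = 2, mu = 3)

# Maximum-likelihood fits of the two candidate distributions
fit_pois <- fitdistr(x, "Poisson")
fit_nb   <- fitdistr(x, "negative binomial")

# Lower AIC indicates the better trade-off between fit and complexity
aic <- c(poisson = AIC(fit_pois), negbin = AIC(fit_nb))
print(aic)

# Likelihood ratio statistic for Poisson vs. negative binomial
lr <- 2 * (as.numeric(logLik(fit_nb)) - as.numeric(logLik(fit_pois)))

# Crude check: variance/mean ratio, which is about 1 for Poisson data
disp <- var(x) / mean(x)
print(c(LR = lr, var_over_mean = disp))
```

One caveat on the likelihood ratio test: the Poisson sits on the
boundary of the negative binomial family (size going to infinity), so
referring lr to a chi-square with 1 df is only approximate and tends
to be conservative.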

for a worked example of a similar problem.
