[R] Automating binning for chisq.test()

Marc Schwartz marc_schwartz at comcast.net
Fri Oct 12 20:06:12 CEST 2007


On Fri, 2007-10-12 at 11:16 -0600, D. R. Evans wrote:
> The standard chisq.test() and fisher.test() functions, when applied to
> two distributions (to determine whether the same underlying
> distribution applies to both), require one to pre-bin the
> distributions.
> 
> Is there a library function (either built-in or in a package) that
> acts more like the ks.test() function, in that one can simply pass the
> two distributions and have it do the necessary binning as well as the
> actual statistical test?
> 
> (Yes, you can accuse me of laziness: I just don't fancy trying to
> figure out a routine that would make sure that there are more than 5
> samples in each of the expected bins before applying the chi-squared
> test. It seems too much like re-inventing an elementary wheel that
> must have been invented by someone else.)

You might want to review the following article:

Chi-squared and Fisher-Irwin tests of two-by-two tables with small 
sample recommendations 
Ian Campbell
Stat in Med 26:3661-3675; 2007 
http://www3.interscience.wiley.com/cgi-bin/abstract/114125487/ABSTRACT 


Frank Harrell has offered some comments here (bottom of page): 

  http://biostat.mc.vanderbilt.edu/twiki/bin/view/Main/DataAnalysisDisc
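
For what it is worth, the kind of automatic binning asked about can be
sketched in a few lines. This is only an illustration under stated
assumptions, not an existing library function: binned_chisq() and its
arguments max_bins and min_expected are hypothetical names, the breaks are
taken from quantiles of the pooled sample, and the "at least 5 expected
per cell" threshold is simply the rule of thumb mentioned in the question.

```r
# Hypothetical helper (not part of base R): bin two numeric samples on
# common breaks and run chisq.test() on the resulting 2 x k table.
binned_chisq <- function(x, y, max_bins = 10, min_expected = 5) {
  stopifnot(is.numeric(x), is.numeric(y), max_bins >= 2)
  pooled <- c(x, y)
  group  <- rep(c("x", "y"), times = c(length(x), length(y)))

  # Start with max_bins bins and coarsen until the expected counts are OK.
  for (k in seq(max_bins, 2)) {
    # Breaks at equally spaced quantiles of the pooled data; unique()
    # guards against duplicated breaks when the data contain many ties.
    breaks <- unique(quantile(pooled,
                              probs = seq(0, 1, length.out = k + 1),
                              names = FALSE))
    if (length(breaks) < 3) next   # fewer than two bins, nothing to test

    bins <- cut(pooled, breaks = breaks, include.lowest = TRUE)
    tab  <- table(group, bins)

    # Expected counts under homogeneity: row total * column total / n
    expected <- outer(rowSums(tab), colSums(tab)) / sum(tab)
    if (all(expected >= min_expected)) {
      return(chisq.test(tab, correct = FALSE))
    }
  }
  stop("no binning found with all expected counts >= min_expected")
}

# Example call on simulated data
set.seed(1)
x <- rnorm(200)
y <- rnorm(150, mean = 0.5)
binned_chisq(x, y)
```

Note that the result depends on the arbitrary choice of breaks, which is
one reason ks.test() is often preferred for continuous data.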

HTH,

Marc Schwartz