[R] Fisher's exact test at < 2.2e-16

Christian Hennig chrish at stats.ucl.ac.uk
Thu Dec 17 15:24:54 CET 2009


I know that you didn't ask for this, but to me this seems to be a very
dodgy method to select a "best number of clusters", with no proper basis at
all. All of these tests are data dependent, so the p-values cannot be
interpreted in the usual way. It is actually not clear how they can be
interpreted, and the freedom to fit a clustering to the data depends on
the number of clusters, so there is no reason to expect that comparing
p-values for different numbers of clusters tells you anything meaningful.
Do you really think that it is an informative difference if one clustering
gives you p=10^{-58} and another one 10^{-30}?
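
To make the data-dependence point concrete, here is a minimal sketch in R.
It rests on assumptions chosen purely for illustration and not taken from
the original analysis: the data are pure noise, kmeans() stands in for the
mixture analysis, and the median split `half` is a hypothetical "expected"
grouping derived from the same data.

```r
set.seed(1)                     # arbitrary seed, only for reproducibility
x <- runif(60)                  # pure noise: no cluster structure at all
half <- factor(x > median(x))   # hypothetical grouping, derived from x itself

pvals <- sapply(2:3, function(k) {
  # clustering fitted to x (k-means as a stand-in, an assumption)
  cl <- kmeans(x, centers = k, nstart = 10)$cluster
  # Fisher's exact test of cluster label against a grouping from the same x
  fisher.test(table(cl, half))$p.value
})
names(pvals) <- c("k=2", "k=3")
print(pvals)                    # both tiny, although x contains only noise
```

Both tests reject overwhelmingly although there is nothing to find, simply
because the clustering and the grouping were computed from the same values.
How small the p-value gets also changes with k, which is the reason why
ranking numbers of clusters by it is hard to justify.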

Christian

On Thu, 17 Dec 2009, Søren Faurby wrote:

> In an effort to select the most appropriate number of clusters in a
> mixture analysis I am comparing the expected and actual membership of
> individuals in various clusters using Fisher's exact test. I aim
> for the model with the lowest possible p-value, but I frequently get
> p-values below 2.2e-16 and therefore do not get exact p-values with
> standard Fisher's exact tests in R.
>
> Does anybody know if there is a version of Fisher's exact test in
> any package which can handle lower probabilities, or have other suggestions 
> as to how I can compare the probabilities?
>
> I am for instance comparing the following two:
>
> dat2<-matrix(c(29,0,29,0,12,0,18,0,0,29,0,16,0,19), nrow=2)
> fisher.test(dat2, workspace=30000000)
>
> dat3<-matrix(c(29,0,0,29,0,0,12,0,0,17,0,1,0,29,0,0,15,1,0,0,19),
> nrow=3)
> fisher.test(dat3, workspace=30000000)
>
> Which both result in p-value < 2.2e-16
>
> Kind regards, Søren
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
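
On the narrower technical question quoted above: the "< 2.2e-16" is a
display threshold of the printed summary (p-values below
.Machine$double.eps are shown that way), while the object returned by
fisher.test() stores the p-value as an ordinary double. The sketch below
uses a hypothetical 2x2 table, not the poster's data, because it is small
enough to check by hand; the same `$p.value` element should be available
for the larger tables, although how accurate it is at such magnitudes has
not been checked here.

```r
# Hypothetical 2x2 table with perfect separation (not the poster's data)
tab <- matrix(c(29, 0, 0, 29), nrow = 2)
ft <- fisher.test(tab)

print(ft)                     # summary shows "p-value < 2.2e-16"
ft$p.value                    # the stored value, not truncated
log10(ft$p.value)             # order of magnitude

# Hand check: only the two most extreme tables count towards the
# two-sided p-value, each with probability 1 / choose(58, 29)
exact <- 2 / choose(58, 29)
all.equal(ft$p.value, exact)
```

Getting at the number does not change the point made above: it is still
unclear what comparing such p-values across clusterings would mean.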

*** --- ***
Christian Hennig
University College London, Department of Statistical Science
Gower St., London WC1E 6BT, phone +44 207 679 1698
chrish at stats.ucl.ac.uk, www.homepages.ucl.ac.uk/~ucakche

