[R] Low level algorithm control in Fisher's exact test

Peter Dalgaard p.dalgaard at biostat.ku.dk
Thu Nov 10 11:32:56 CET 2005


Uwe Ligges <ligges at statistik.uni-dortmund.de> writes:

> Kihwang Lee wrote:
> 
> > Hi folks,
> > 
> > Forgive me if this question is a trivial issue.
> > 
> > I was doing a series of Fisher's exact tests using the fisher.test
> > function in the stats package.
> > Since the counts I have are quite large (c(64, 3070, 2868, 4961135)), R
> > suggested that I use *other algorithms* for the test, which, as I
> > understand it, can be specified through the 'control' argument of
> > fisher.test. But where can I find the other algorithms that I can use?
> > I hoped to find relevant information in the manual but could not.
> > 
> > Can anybody help me out there?
> 
> What about a chisq.test? And honestly, I know the answer before 
> calculating anything ....

Actually, chisq.test complains that the expected values are too low...
I.e., you expected fewer than 5 and got 64! So the chi-square
approximation might not be perfect, but p < 2e-16 should be close
enough for jazz.
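
To see what chisq.test is complaining about, the expected counts under
independence can be computed from the table margins. This is only a
sketch: the layout of M (the four counts from the question, filled
column-wise) is an assumption, since the thread never shows how the
table was built.

```r
# Assumed layout: the four counts from the question, filled column-wise.
M <- matrix(c(64, 3070, 2868, 4961135), nrow = 2)

# Expected counts under independence:
# (row total * column total) / grand total
expected <- outer(rowSums(M), colSums(M)) / sum(M)

expected[1, 1]   # about 1.85, i.e. below 5, hence the complaint
M[1, 1]          # observed count in the same cell: 64
```

Computing the expected counts by hand avoids the warning that
chisq.test(M) itself would emit.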

There's a buglet in the internal FEXACT code that causes it to
allocate a workspace that is way too big for cases like this. If you
really want to know what the p value is, phyper() is less sensitive:

> phyper(63,2932,4964205,3134,lower=FALSE)
[1] 4.512776e-74

(and in cases where one group is much larger than the other, you're
not far off by assuming that the probability in that group is known,
leading to a binomial test:

> binom.test(64,3134,p=2868/4961135)$p.value
[1] 2.368985e-74
)
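
The phyper() arguments above are the margins of the 2x2 table. A sketch
of how they can be derived, assuming M holds the counts from the
question filled column-wise (M is not defined in this thread, so that
layout is a guess):

```r
# Assumed layout: the four counts from the question, filled column-wise.
M <- matrix(c(64, 3070, 2868, 4961135), nrow = 2)

x <- M[1, 1]       # observed count in the cell of interest: 64
m <- sum(M[1, ])   # first row total:     64 + 2868      = 2932
n <- sum(M[2, ])   # second row total:    3070 + 4961135 = 4964205
k <- sum(M[, 1])   # first column total:  64 + 3070      = 3134

# Upper tail P(X >= x) equals P(X > x - 1), hence x - 1 together
# with lower.tail = FALSE.
p_upper <- phyper(x - 1, m, n, k, lower.tail = FALSE)
p_upper
```

The lower=FALSE spelling in the call above relies on partial argument
matching; lower.tail is the full argument name.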

The control= argument is not too well documented, but according to my
reading of the code, it is only used to set the "mult" argument to
.C("fexact", ...) and has no effect on the current issue.

Actually, the fexact C code is only used if or=1 (the default), so
another way out is

> fisher.test(M,or=1+1e-15)$p.value
[1] 4.512776e-74
> fisher.test(M,or=1-1e-15)$p.value
[1] 4.512776e-74
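
M is not defined anywhere in this thread. A self-contained sketch of the
workaround, assuming M is the 2x2 matrix of the counts from the question
filled column-wise; how or=1 is handled may differ in R versions other
than the one current at the time of writing:

```r
# Assumed layout: the four counts from the question, filled column-wise.
M <- matrix(c(64, 3070, 2868, 4961135), nrow = 2)

# An odds ratio a hair away from 1 avoids the or = 1 code path while
# leaving the null hypothesis numerically unchanged.
p_above <- fisher.test(M, or = 1 + 1e-15)$p.value
p_below <- fisher.test(M, or = 1 - 1e-15)$p.value

c(p_above, p_below)
```

Both 1 + 1e-15 and 1 - 1e-15 are representable as doubles distinct from
1, so the comparison or == 1 is false in either call.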

-- 
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark          Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)                  FAX: (+45) 35327907



