[R] Fisher's Test 5x4 table

Gerrit Eichner Gerrit.Eichner at math.uni-giessen.de
Fri Aug 28 15:52:37 CEST 2015


Paul,

as the error messages of your first four attempts (see below) tell you, 
in an admittedly rather cryptic way, your table, or rather its sample 
size, is too large for the exact algorithm implemented in fisher.test(): 
either the "largest (hash table) key" is too large, or your (i.e., R's) 
workspace is too small, or your hardware/OS cannot allocate enough memory 
to calculate the p-value of Fisher's exact test exactly.

One way out of this is to approximate the exact p-value through 
simulation, but apparently there was a typo in your (last) attempt to 
do that (Error: unexpected '>' in ">").
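
As an aside on that last error: R typically reports "unexpected '>'" when 
a prompt character gets pasted in together with the command. Whether that 
is what happened in your session is only a guess; the minimal sketch below 
just reproduces the message with a made-up command, not with your data.

```r
# Hypothetical illustration: a command copied together with the console
# prompt "> " is not valid R code and already fails at the parsing stage.
pasted <- "> sqrt( 16)"

msg <- tryCatch( { parse( text = pasted); "parsed without error" },
                 error = function( e) conditionMessage( e))
print( msg)      # the message complains about an unexpected '>'

# Without the leading prompt the same command parses and evaluates fine.
clean  <- sub( "^> ", "", pasted)
result <- eval( parse( text = clean))
print( result)
```

So if a call is copied from an email or a console transcript, the leading 
"> " (and any "+ " continuation prompts) have to be removed first.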


So, for me the following works (and it should for you, too) and gives the 
output shown (after a very short while):

> Trapz <- as.matrix( read.table( "w.txt", header = TRUE, row.names = "Traps"))

> set.seed( 20150828)   # For the sake of reproducibility.
> fisher.test( Trapz, simulate.p.value = TRUE,
+             B = 1e5)

    Fisher's Exact Test for Count Data with simulated p-value (based on
    1e+05 replicates)

data:  Trapz
p-value = 1e-05
alternative hypothesis: two.sided



Or with a larger value of B, if you are patient enough (the computing 
time is several seconds):

> set.seed( 20150828)
> fisher.test( Trapz, simulate.p.value=TRUE, B = 1e7)

    Fisher's Exact Test for Count Data with simulated p-value (based on
    1e+07 replicates)

data:  Trapz
p-value = 1e-07
alternative hypothesis: two.sided
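
A remark on reading these two outputs: as far as I know, fisher.test() 
computes the simulated p-value as (1 + number of simulated tables at least 
as extreme as the observed one) / (B + 1), so it can never be smaller than 
1 / (B + 1). Both p-values above sit at that lower bound, which suggests 
that none of the simulated tables was as extreme as yours, and that the 
results are better read as "p < 1e-05" and "p < 1e-07", respectively. 
Below is a minimal sketch of that mechanism, using r2dtable() and a 
made-up 3x3 table (not your data), and assuming that the statistic is 
-sum( lfactorial( table)); it is an illustration, not the internal code.

```r
# Made-up 3x3 table of counts, purely for illustration (NOT the trap data).
x <- matrix( c( 12, 3,  2,
                 4, 9,  3,
                 2, 4, 11), nrow = 3, byrow = TRUE)

B <- 2000
set.seed( 20150828)

# Assumed test statistic: given the margins, tables with a smaller value
# are less probable under the null hypothesis.
stat <- function( tab) -sum( lfactorial( tab))
obs  <- stat( x)

# Draw B random tables with the same row and column sums as x.
sims <- vapply( r2dtable( B, rowSums( x), colSums( x)), stat, numeric( 1))

# Count simulated tables at least as extreme as the observed one (with a
# small tolerance for ties); the "+ 1" counts the observed table itself.
n.extreme <- sum( sims <= obs + 1e-7)
p.sim     <- (1 + n.extreme) / (B + 1)

p.floor <- 1 / (B + 1)   # smallest value p.sim can possibly take
print( c( p.sim = p.sim, p.floor = p.floor))
```

Increasing B therefore lowers the smallest p-value that can be reported, 
which is why the second run above shows 1e-07 instead of 1e-05.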


  Hth  --  Gerrit

(BTW, you don't have to specify arguments (in function calls) whose 
default values you don't want to change.)
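
To illustrate the remark in parentheses: assuming the defaults of 
fisher.test() are the values you spelled out in your calls (workspace = 
200000, hybrid = FALSE, conf.int = TRUE, and so on), the short and the 
long form of a call should give the same result. A sketch with a made-up 
3x3 table (again not your data):

```r
# Made-up 3x3 table of counts, purely for illustration (NOT the trap data).
x <- matrix( c( 12, 3,  2,
                 4, 9,  3,
                 2, 4, 11), nrow = 3, byrow = TRUE)

# Short form: only the arguments that differ from their defaults.
set.seed( 1)
short <- fisher.test( x, simulate.p.value = TRUE, B = 1e4)

# Long form: the same call with the other arguments spelled out at the
# values used in the quoted calls below.
set.seed( 1)
long <- fisher.test( x, workspace = 200000, hybrid = FALSE,
                     control = list(), or = 1,
                     alternative = "two.sided", conf.int = TRUE,
                     conf.level = 0.95, simulate.p.value = TRUE, B = 1e4)

same <- identical( short$p.value, long$p.value)
print( same)
```

The short form is easier to read and leaves less room for typos.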



On Fri, 28 Aug 2015, paul brett wrote:

> Hi Gerrit,
>             I spotted that, it was a mistake on my own part, it should
> read 1.trap.2.barrier. I have corrected it on the file attached.
>
> So I have done these so far:
> > fisher.test(Trapz, workspace = 200000, hybrid = FALSE, control = list(),
> or = 1, alternative = "two.sided", conf.int = TRUE, conf.level =
> 0.95,simulate.p.value = FALSE, B = 2000)
> Error in fisher.test(Trapz, workspace = 2e+05, hybrid = FALSE, control =
> list(),  :
>  FEXACT error 501.
> The hash table key cannot be computed because the largest key
> is larger than the largest representable int.
> The algorithm cannot proceed.
> Reduce the workspace size or use another algorithm.
>
>> fisher.test(Trapz, workspace = 2000, hybrid = FALSE, control = list(), or
> = 1, alternative = "two.sided", conf.int = TRUE, conf.level =
> 0.95,simulate.p.value = FALSE, B = 2000)
> Error in fisher.test(Trapz, workspace = 2000, hybrid = FALSE, control =
> list(),  :
>  FEXACT error 40.
> Out of workspace.
>> fisher.test(Trapz, workspace = 1e8, hybrid = FALSE, control = list(), or
> = 1, alternative = "two.sided", conf.int = TRUE, conf.level =
> 0.95,simulate.p.value = FALSE, B = 2000)
> Error in fisher.test(Trapz, workspace = 1e+08, hybrid = FALSE, control =
> list(),  :
>  FEXACT error 501.
> The hash table key cannot be computed because the largest key
> is larger than the largest representable int.
> The algorithm cannot proceed.
> Reduce the workspace size or use another algorithm.
>> fisher.test(Trapz, workspace = 2000000000, hybrid = FALSE, control =
> list(), or = 1, alternative = "two.sided", conf.int = TRUE, conf.level =
> 0.95,simulate.p.value = FALSE, B = 2000)
> Error: cannot allocate vector of size 7.5 Gb
> In addition: Warning messages:
> 1: In fisher.test(Trapz, workspace = 2e+09, hybrid = FALSE, control =
> list(),  :
>  Reached total allocation of 6027Mb: see help(memory.size)
> 2: In fisher.test(Trapz, workspace = 2e+09, hybrid = FALSE, control =
> list(),  :
>  Reached total allocation of 6027Mb: see help(memory.size)
> 3: In fisher.test(Trapz, workspace = 2e+09, hybrid = FALSE, control =
> list(),  :
>  Reached total allocation of 6027Mb: see help(memory.size)
> 4: In fisher.test(Trapz, workspace = 2e+09, hybrid = FALSE, control =
> list(),  :
>  Reached total allocation of 6027Mb: see help(memory.size)
>
> fisher.test(Trapz, workspace = 1e8, hybrid = FALSE, control = list(), or =
> 1, alternative = "two.sided", conf.int = TRUE, conf.level =
> 0.95,simulate.p.value = TRUE, B = 1e5)
> Error: unexpected '>' in ">"
>
> So the issue could be perhaps that R cannot compute my sample as the
> workspace needed is too big? Is there a way around this? I think I have
> everything set out correctly.
> Is my only other alternative to do a 2x2 Fisher test for each of the
> variables?
>
> I attach as a pdf the Minitab result for the chi-squared test as proof (I
> know that getting very low p-values is highly unlikely but sometimes it
> happens). Seeing is believing, I suppose!
>
> Regards,
>             Paul
>
>
>
> On Fri, Aug 28, 2015 at 8:56 AM, Gerrit Eichner <
> Gerrit.Eichner at math.uni-giessen.de> wrote:
>
>> Dear Paul,
>>
>> quoting the email-footer: "PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html and provide commented,
>> minimal, self-contained, reproducible code."
>>
>> So, what exactly did you try and what was the actual problem/error message?
>>
>> Besides that, have you noticed that two of your data rows have the same name?
>>
>>
>> Have you read the online help page of fisher.test():
>>
>>  ?fisher.test
>>
>>
>> Have you tried anything like the following?
>>
>> W <- as.matrix( read.table( "w.txt", header = TRUE)[-1])
>>
>> fisher.test( W, workspace = 1e8)
>>    # For workspace look at the help page, but it presumably
>>    # won't work because of your sample size.
>>
>>
>> set.seed( 20150828) # for reproducibility
>> fisher.test( W, simulate.p.value = TRUE, B = 1e5)
>>    # For B look at the help page.
>>
>>
>> Finally: Did Minitab really report "p > 0.001"? ;-)
>>
>>  Hth  --  Gerrit
>>
>>
>> Dear all,
>>>            I am trying to do a Fisher's test on a 5x4 table in R.
>>> I have already done a chi-squared test using Minitab on this
>>> data set, getting a result of (1, N = 165.953, DF 12, p>0.001), yet using
>>> these results (even though they are excellent) may not be suitable for
>>> publication. I have tried numerous other statistical packages in the hope
>>> of doing this test, yet each one offers just the 2x2 table.
>>>            I am struggling to adapt the template Fisher's test in R to fit
>>> my table (according to the R book it is possible, yet I cannot get it
>>> to work). The template given in the R documentation and the R book is for
>>> a 2x2 Fisher test. What do I need to change to get this to work? I have
>>> attached the data to the email so one can see what I mean. Or do I have
>>> to write my own new code to compute this?
>>>
>>>             Yours Sincerely,
>>>                                     Paul Brett
>>>
>>>


