[R] Fisher's Test 5x4 table

Gerrit Eichner Gerrit.Eichner at math.uni-giessen.de
Mon Aug 31 09:28:49 CEST 2015


Paul,

in addition to Peter's remark about the missing theory, you have also 
completely failed to explain what you mean by "[it] is not giving me the 
results for the calculations" or "[how] to get the results of the fisher 
test". They are there in the output of R's fisher.test() (provided you have 
an idea about the theory).
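
As a minimal sketch of where those results live: fisher.test() returns an 
object of class "htest" whose components can be inspected directly. The 5x4 
table `tab` below is made-up illustration data (an assumption, not your 
Trapz), so the numbers it produces mean nothing for your problem.

```r
# Hypothetical 5x4 table of counts; NOT the data from this thread.
tab <- matrix(c(12,  5,  7,  3,
                 4, 15,  6,  8,
                 9,  2, 14,  5,
                 3,  7,  4, 16,
                 6,  9,  8,  2),
              nrow = 5, byrow = TRUE)

set.seed(20150831)   # For the sake of reproducibility.
res <- fisher.test(tab, simulate.p.value = TRUE, B = 1e4)

class(res)     # "htest"
names(res)     # the components stored in the result
res$p.value    # the (simulated) p-value as a plain number
res$method     # description of the test that was carried out
```

Printing `res` only formats these components; for an r x c table there is no 
odds ratio or confidence interval to report, so the p-value is the result.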

And again:

> fisher.test( Trapz, simulate.p.value = TRUE, B = 1e5)

specifies enough arguments when the p-value is approximated by simulation, 
since workspace (quoting from the help page) is "Only used for 
***non-simulated*** p-values [of] larger than 2 by 2 tables." (Similarly, 
control and hybrid are not needed here either.)
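
A small sketch of both points, again on a made-up 5x4 table (`tab` and the 
helper `min_sim_p` are hypothetical names for illustration only). As far as 
I read the code of fisher.test(), the simulated p-value is computed as 
(1 + number of simulated tables at least as extreme) / (B + 1), so it can 
never be smaller than 1 / (B + 1). That would be consistent with the 
printed 1e-05 for B = 1e5 and 1e-07 for B = 1e7.

```r
# Smallest p-value a simulation with B replicates can report, assuming the
# estimate is (1 + #{simulated tables at least as extreme}) / (B + 1).
min_sim_p <- function(B) 1 / (B + 1)

min_sim_p(1e5)   # lower bound for B = 1e5
min_sim_p(1e7)   # lower bound for B = 1e7

# Hypothetical 5x4 table of counts; NOT the data from this thread.
tab <- matrix(c(12,  5,  7,  3,
                 4, 15,  6,  8,
                 9,  2, 14,  5,
                 3,  7,  4, 16,
                 6,  9,  8,  2),
              nrow = 5, byrow = TRUE)

# Same seed, once without and once with the superfluous arguments.
set.seed(20150831)
p1 <- fisher.test(tab, simulate.p.value = TRUE, B = 2000)$p.value

set.seed(20150831)
p2 <- fisher.test(tab, workspace = 2e6, hybrid = TRUE,
                  simulate.p.value = TRUE, B = 2000)$p.value

identical(p1, p2)   # workspace and hybrid play no role when simulating
```

So a larger B does not produce a "different" result here; it only lowers 
the smallest p-value that the simulation is able to report.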

  Regards  --  Gerrit

---------------------------------------------------------------------
Dr. Gerrit Eichner                   Mathematical Institute, Room 212
gerrit.eichner at math.uni-giessen.de   Justus-Liebig-University Giessen
Tel: +49-(0)641-99-32104          Arndtstr. 2, 35392 Giessen, Germany
Fax: +49-(0)641-99-32109        http://www.uni-giessen.de/cms/eichner
---------------------------------------------------------------------


On Sun, 30 Aug 2015, paul brett wrote:

> Hi Gerrit,
>             I tried both of your suggestions and got the exact same thing.
> Fisher's Exact Test for Count Data with simulated p-value (based on 1e+05
> replicates)
>
> data:  Trapz
> p-value = 1e-05
> alternative hypothesis: two.sided
>
> I made a few changes myself based on what the details section says should
> be used for a table larger than 2x2, and got the exact same thing as
> before. I removed or = 1 and conf.int = TRUE, added y = NULL and control =
> list(30), and changed simulate.p.value to TRUE.
> > fisher.test( Trapz, y = NULL, workspace = 200000, hybrid = TRUE,control =
> list(30), simulate.p.value = TRUE, B =1e5)
> Fisher's Exact Test for Count Data with simulated p-value (based on 1e+05
> replicates)
>
> data:  Trapz
> p-value = 1e-05
> alternative hypothesis: two.sided
>
> > fisher.test( Trapz, y = NULL, workspace = 200000, hybrid = TRUE,control =
> list(30), simulate.p.value = TRUE, B =1e7)
>
> Fisher's Exact Test for Count Data with simulated p-value (based on 1e+07
> replicates)
>
> data:  Trapz
> p-value = 1e-07
> alternative hypothesis: two.sided
>
>
> Despite these changes, the changed call is not giving me the results
> for the calculations. The changes I have made seem to satisfy what is in
> the details section of the R help page, and I don't have the workspace
> issue any more.
> What do I do to get the results of the fisher test?
> Is there something simple that I am missing?
>
> Regards,
>             Paul
>
> On Fri, Aug 28, 2015 at 3:52 PM, Gerrit Eichner <
> Gerrit.Eichner at math.uni-giessen.de> wrote:
>
>> Paul,
>>
>> as the error messages of your first three attempts (see below) tell you -
>> in an admittedly rather cryptic way - your table or its sample size is
>> too large, so that either the "largest (hash table) key" is too large, or
>> your (i.e., R's) workspace is too small, or your hardware/OS cannot
>> allocate enough memory to calculate the p-value of Fisher's Exact Test
>> exactly by means of the implemented algorithm.
>>
>> One way out of this is to approximate the exact p-value through
>> simulation, but apparently there occurred a typo in your (last) attempt to
>> do that (Error: unexpected '>' in ">").
>>
>>
>> So, for me the following works (and it should also for you) and gives the
>> shown output (after a very short while):
>>
>> > Trapz <- as.matrix( read.table( "w.txt", head = T, row.names = "Traps"))
>>
>> > set.seed( 20150828)   # For the sake of reproducibility.
>> > fisher.test( Trapz, simulate.p.value = TRUE,
>> +             B = 1e5)
>>
>>    Fisher's Exact Test for Count Data with simulated p-value (based on
>>    1e+05 replicates)
>>
>> data:  Trapz
>> p-value = 1e-05
>> alternative hypothesis: two.sided
>>
>>
>>
>> Or for a higher value for B if you are patient enough (with a computing
>> time of several seconds) :
>>
>> > set.seed( 20150828)
>> > fisher.test( Trapz, simulate.p.value=TRUE, B = 1e7)
>>
>>    Fisher's Exact Test for Count Data with simulated p-value (based on
>>    1e+07 replicates)
>>
>> data:  Trapz
>> p-value = 1e-07
>> alternative hypothesis: two.sided
>>
>>
>>  Hth  --  Gerrit
>>
>> (BTW, you don't have to specify arguments (in function calls) whose
>> default values you don't want to change.)
>>
>>
>>
>>
>> On Fri, 28 Aug 2015, paul brett wrote:
>>
>>> Hi Gerrit,
>>>             I spotted that, it was a mistake on my own part, it should
>>> read 1.trap.2.barrier. I have corrected it on the file attached.
>>>
>>> So I have done these so far:
>>> > fisher.test(Trapz, workspace = 200000, hybrid = FALSE, control = list(),
>>> or = 1, alternative = "two.sided", conf.int = TRUE, conf.level =
>>> 0.95,simulate.p.value = FALSE, B = 2000)
>>> Error in fisher.test(Trapz, workspace = 2e+05, hybrid = FALSE, control =
>>> list(),  :
>>>  FEXACT error 501.
>>> The hash table key cannot be computed because the largest key
>>> is larger than the largest representable int.
>>> The algorithm cannot proceed.
>>> Reduce the workspace size or use another algorithm.
>>>
>>> > fisher.test(Trapz, workspace = 2000, hybrid = FALSE, control = list(), or
>>> = 1, alternative = "two.sided", conf.int = TRUE, conf.level =
>>> 0.95,simulate.p.value = FALSE, B = 2000)
>>> Error in fisher.test(Trapz, workspace = 2000, hybrid = FALSE, control =
>>> list(),  :
>>>  FEXACT error 40.
>>> Out of workspace.
>>>
>>> > fisher.test(Trapz, workspace = 1e8, hybrid = FALSE, control = list(), or
>>> = 1, alternative = "two.sided", conf.int = TRUE, conf.level =
>>> 0.95,simulate.p.value = FALSE, B = 2000)
>>> Error in fisher.test(Trapz, workspace = 1e+08, hybrid = FALSE, control =
>>> list(),  :
>>>  FEXACT error 501.
>>> The hash table key cannot be computed because the largest key
>>> is larger than the largest representable int.
>>> The algorithm cannot proceed.
>>> Reduce the workspace size or use another algorithm.
>>>
>>> > fisher.test(Trapz, workspace = 2000000000, hybrid = FALSE, control =
>>> list(), or = 1, alternative = "two.sided", conf.int = TRUE, conf.level =
>>> 0.95,simulate.p.value = FALSE, B = 2000)
>>> Error: cannot allocate vector of size 7.5 Gb
>>> In addition: Warning messages:
>>> 1: In fisher.test(Trapz, workspace = 2e+09, hybrid = FALSE, control =
>>> list(),  :
>>>  Reached total allocation of 6027Mb: see help(memory.size)
>>> 2: In fisher.test(Trapz, workspace = 2e+09, hybrid = FALSE, control =
>>> list(),  :
>>>  Reached total allocation of 6027Mb: see help(memory.size)
>>> 3: In fisher.test(Trapz, workspace = 2e+09, hybrid = FALSE, control =
>>> list(),  :
>>>  Reached total allocation of 6027Mb: see help(memory.size)
>>> 4: In fisher.test(Trapz, workspace = 2e+09, hybrid = FALSE, control =
>>> list(),  :
>>>  Reached total allocation of 6027Mb: see help(memory.size)
>>>
>>> fisher.test(Trapz, workspace = 1e8, hybrid = FALSE, control = list(), or =
>>> 1, alternative = "two.sided", conf.int = TRUE, conf.level =
>>> 0.95,simulate.p.value = TRUE, B = 1e5)
>>> Error: unexpected '>' in ">"
>>>
>>> So the issue could perhaps be that R cannot compute my sample because the
>>> workspace needed is too big? Is there a way around this? I think I have
>>> everything set out correctly.
>>> Is my only other alternative to do a 2x2 Fisher test for each of the
>>> variables?
>>>
>>> I attach as a pdf the Minitab result for the chi-squared test as proof (I
>>> know that getting very low p-values is highly unlikely, but sometimes it
>>> happens). Seeing is believing, I suppose!
>>>
>>> Regards,
>>>             Paul
>>>
>>>
>>>
>>> On Fri, Aug 28, 2015 at 8:56 AM, Gerrit Eichner <
>>> Gerrit.Eichner at math.uni-giessen.de> wrote:
>>>
>>>> Dear Paul,
>>>>
>>>> quoting the email-footer: "PLEASE do read the posting guide
>>>> http://www.R-project.org/posting-guide.html and provide commented,
>>>> minimal, self-contained, reproducible code."
>>>>
>>>> So, what exactly did you try and what was the actual problem/error
>>>> message?
>>>>
>>>> Besides that, have you noticed that two of your data rows have the same
>>>> name?
>>>>
>>>>
>>>> Have you read the online help page of fisher.test():
>>>>
>>>>  ?fisher.test
>>>>
>>>>
>>>> Have you tried anything like the following?
>>>>
>>>> W <- as.matrix( read.table( "w.txt", head = T)[-1])
>>>>
>>>> fisher.test( W, workspace = 1e8)
>>>>    # For workspace look at the help page, but it presumably
>>>>    # won't work because of your sample size.
>>>>
>>>>
>>>> set.seed( 20150828) # for reproducibility
>>>> fisher.test( W, simulate.p.value = TRUE, B = 1e5)
>>>>    # For B look at the help page.
>>>>
>>>>
>>>> Finally: Did Minitab really report "p > 0.001"? ;-)
>>>>
>>>>  Hth  --  Gerrit
>>>>
>>>>
>>>>> Dear all,
>>>>>
>>>>>            I am trying to do a Fisher's test on a 5x4 table in R. I have
>>>>> already done a chi-squared test using Minitab on this data set, getting
>>>>> a result of (1, N = 165.953, DF 12, p>0.001), yet using these results
>>>>> (even though they are excellent) may not be suitable for publication. I
>>>>> have tried numerous other statistical packages in the hope of doing this
>>>>> test, yet each one offers only the 2x2 table.
>>>>>            I am struggling to edit the template Fisher's test in R to fit
>>>>> my table (according to the R book it is possible, yet I cannot get it to
>>>>> work). The template given in the R documentation and the R book is for a
>>>>> 2x2 Fisher test. What do I need to change to get this to work? I have
>>>>> attached the data to the email so one can see what I am on about. Or do
>>>>> I have to write my own new code to compute this?
>>>>>
>>>>>             Yours Sincerely,
>>>>>                                     Paul Brett


