[Rd] Numerical stability in chisq.test

Jan Motl yzan at volny.cz
Fri Dec 29 14:53:43 CET 2017


Hi,

there is also PR#8224, which seems to be relevant. I executed the following code:

## Modify the function
chisq.test2 <- edit(chisq.test) # In the editor, change the sort at line 57 to increasing order


## PR#8224 (pathological contingency table)
m <- matrix(c(1, 0, 7, 16), 2, 2)

# Original: 2001 simulated p-values
original <- replicate(2001, chisq.test(m, simulate.p.value = TRUE)$p.value)

# Modified: 2001 simulated p-values
modified <- replicate(2001, chisq.test2(m, simulate.p.value = TRUE)$p.value)

# Evaluation: compare the two sets of p-values
t.test(original, modified)


## PR#3486 (invariance to transposition)
x <- rbind(c(149, 151), c(1, 8))

# Original: 201 simulated p-values each for x and t(x)
c2x <- replicate(201, chisq.test(x, simulate.p.value = TRUE, B = 100000)$p.value)
c2tx <- replicate(201, chisq.test(t(x), simulate.p.value = TRUE, B = 100000)$p.value)
sum(abs(c2x - c2tx))

# Modified: 201 simulated p-values each for x and t(x)
mc2x <- replicate(201, chisq.test2(x, simulate.p.value = TRUE, B = 100000)$p.value)
mc2tx <- replicate(201, chisq.test2(t(x), simulate.p.value = TRUE, B = 100000)$p.value)
sum(abs(mc2x - mc2tx))

# Evaluation: compare the transposition differences of the two versions
t.test(c2x - c2tx, mc2x - mc2tx)

on two computers:
	1) OS: OS X 10.11.6, x86_64, darwin15.6.0; Version: R version 3.4.2 (2017-09-28)
	2) OS: Windows XP, i386, mingw32; Version: R version 3.4.3 (2017-11-30)

On both computers, the increasing and the decreasing sort order give approximately the same results.
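
The edit() step above is interactive, so here is a rough sketch of how the same kind of change could be made programmatically. It assumes that the deparsed source of chisq.test contains the text "decreasing = TRUE" in its sort() call(s), which may not hold in every R version. Unlike the manual edit at line 57, it flips every such occurrence, not just one.

```r
# Sketch: build a variant of chisq.test that sorts in increasing order,
# without the interactive edit() step.
# Assumption: the deparsed source contains "decreasing = TRUE".
src <- deparse(stats::chisq.test)
stopifnot(any(grepl("decreasing = TRUE", src, fixed = TRUE)))
src <- gsub("decreasing = TRUE", "decreasing = FALSE", src, fixed = TRUE)

# Evaluate in the environment of the original function so that the
# internal objects it refers to are still found.
chisq.test2 <- eval(parse(text = src), envir = environment(stats::chisq.test))

# Quick check on the PR#8224 table
m <- matrix(c(1, 0, 7, 16), 2, 2)
chisq.test2(m, simulate.p.value = TRUE)$p.value
```

If the stopifnot() guard fails, the source differs from what is assumed here and the change has to be made by hand.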

Best regards,
 Jan Motl

> My thoughts too. PR 3486 is about simulated tables that theoretically have STATISTIC equal to the one observed, but come out slightly different, messing up the simulated p value. The sort is not actually intended to squeeze the very last bit of accuracy out of the computation, just to make sure that the round-off affects equivalent tables in the same way. "Fixing" the code may therefore unfix PR#3486; at the very least some care is required if this is modified.  
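
To illustrate the point about round-off in the quoted text, here is a small sketch that is independent of chisq.test. The helper add_in_order is a made-up name for this example. It adds the terms strictly left to right, because sum() may accumulate in extended precision on some platforms and hide the effect.

```r
# The same three terms in two different orders.
a <- c(0.1, 0.2, 0.3)
b <- rev(a)

# Strict left-to-right addition in double precision.
add_in_order <- function(v) Reduce(`+`, v)

# Unsorted: the result can depend on the order of the terms.
add_in_order(a) == add_in_order(b)

# Sorted first: both vectors are added in the same order, so the
# results are bit-for-bit identical regardless of the input order.
identical(add_in_order(sort(a, decreasing = TRUE)),
          add_in_order(sort(b, decreasing = TRUE)))
```

The invariance comes from using one fixed order, not from the direction of the sort, which would be consistent with increasing and decreasing order behaving alike in the runs above.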




