[R] Why two chisq.test p values differ when the contingency

Ravi Varadhan rvaradha at jhsph.edu
Tue Jul 15 23:02:05 CEST 2003


Hi Tao:

The P-values for a 2x2 table are generated by random sampling (discrete 
uniform distribution) from all possible 2x2 tables, conditioning 
on the observed margin totals. If one of the cells is extremely small, 
as in your case, you get a big difference in P-values. If you 
changed the cell with value 1 to, say, 5 or 6, the two P-values 
would be nearly the same. However, I don't understand why they should be so 
different, since the set of all possible 2x2 tables is the same in 
both cases. I would be interested in knowing how this happens.
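
The point that the set of possible tables is the same for x and t(x) can be checked without simulation. What follows is only a rough sketch, not what chisq.test itself does: exact_p() is a hypothetical helper, the 1e-7 tolerance for ties is an arbitrary choice, and each table is weighted by its hypergeometric probability given the margins (an assumption of the sketch, as opposed to uniform weights). It enumerates every 2x2 table with the observed margins and sums the probability of those whose Pearson statistic is at least the observed one.

```r
# Table from the thread.
x <- matrix(c(149, 1, 151, 8), nrow = 2)

# exact_p() is a hypothetical helper, not part of R.
exact_p <- function(tab) {
  r  <- rowSums(tab)
  cc <- colSums(tab)
  n  <- sum(tab)
  E  <- outer(r, cc) / n   # expected counts under independence

  # Pearson statistic of the 2x2 table whose [1,1] cell equals a;
  # with fixed margins, a determines the other three cells.
  pearson <- function(a) {
    tab_a <- matrix(c(a, cc[1] - a, r[1] - a, cc[2] - r[1] + a), nrow = 2)
    sum((tab_a - E)^2 / E)
  }

  # All values of the [1,1] cell compatible with the margins.
  a_all <- max(0, r[1] - cc[2]):min(r[1], cc[1])

  # Assumed weighting: hypergeometric probability given the margins.
  prob <- dhyper(a_all, m = cc[1], n = cc[2], k = r[1])
  stat <- sapply(a_all, pearson)
  obs  <- pearson(tab[1, 1])

  # Small relative tolerance so floating-point ties count as ">=".
  sum(prob[stat >= obs * (1 - 1e-7)])
}

p_x  <- exact_p(x)
p_tx <- exact_p(t(x))
print(c(x = p_x, tx = p_tx))
all.equal(p_x, p_tx)
```

If the two printed values agree, the enumeration itself does not depend on the orientation of the table, so any difference between the simulated P-values would have to come from the simulation step.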

Ravi.


----- Original Message -----
From: "Shi, Tao" <shidaxia at yahoo.com>
Date: Tuesday, July 15, 2003 4:37 pm
Subject: RE: [R] Why two chisq.test p values differ when the contingency

> Hi, Ted and Dennis:
> 
> Thanks for your speedy replies!  I don't think this happens just 
> by chance; rather, I think it may be due to the way the chisq.test 
> function handles simulation.  Here is why (Ted, I think there 
> is an error in your code: "tx" should be t(x)):
> 
> > x
>     [,1] [,2]
> [1,]  149  151
> [2,]    1    8
> > c2x<-chisq.test(x, simulate.p.value=T, B=100000)$p.value
> > for(i in (1:20)){c2x<-c(c2x,chisq.test(x, simulate.p.value=T,
> +                        B=100000)$p.value)}
> > c2tx<-chisq.test(t(x), simulate.p.value=T, B=100000)$p.value
> > for(i in (1:20)){c2tx<-c(c2tx,chisq.test(t(x), simulate.p.value=T,
> +                         B=100000)$p.value)}
> > cbind(c2x,c2tx)
>           c2x    c2tx
>  [1,] 0.03727 0.01629
>  [2,] 0.03682 0.01662
>  [3,] 0.03671 0.01665
>  [4,] 0.03788 0.01745
>  [5,] 0.03706 0.01646
>  [6,] 0.03715 0.01728
>  [7,] 0.03664 0.01683
>  [8,] 0.03681 0.01720
>  [9,] 0.03742 0.01758
> [10,] 0.03712 0.01685
> [11,] 0.03739 0.01615
> [12,] 0.03811 0.01653
> [13,] 0.03711 0.01673
> [14,] 0.03639 0.01678
> [15,] 0.03714 0.01719
> [16,] 0.03774 0.01780
> [17,] 0.03574 0.01707
> [18,] 0.03661 0.01705
> [19,] 0.03751 0.01711
> [20,] 0.03683 0.01718
> [21,] 0.03678 0.01653
> 
> 
> 
> ...Tao
> 
> ============================================================
> Ted.Harding at nessie.mcc.ac.uk wrote:
> On 15-Jul-03 Tao Shi wrote:
> >>x
> >      [,1] [,2]
> > [1,]  149  151
> > [2,]    1    8
> >>t(x)
> >      [,1] [,2]
> > [1,]  149    1
> > [2,]  151    8
> >>chisq.test(x, simulate.p.value=T, B=100000)
> > Pearson's Chi-squared test with simulated p-value (based on
> > 1e+05 replicates)
> > data: x
> > X-squared = 5.2001, df = NA, p-value = 0.03774
> > 
> >>chisq.test(t(x), simulate.p.value=T, B=100000)
> > Pearson's Chi-squared test with simulated p-value (based on
> > 1e+05 replicates)
> > data: t(x)
> > X-squared = 5.2001, df = NA, p-value = 0.01642
> 
> Possibly you may just have been unlucky, though the 0.03774 seems 
> large:
> c2x<-chisq.test(x, simulate.p.value=T, B=100000)$p.value
> for(i in (1:9)){c2x<-c(c2x,chisq.test(x, simulate.p.value=T,
> B=100000)$p.value)}
> c2tx<-chisq.test(tx, simulate.p.value=T, B=100000)$p.value
> for(i in (1:9)){c2tx<-c(c2tx,chisq.test(tx, simulate.p.value=T,
> B=100000)$p.value)}
> cbind(c2x,c2tx)
>           c2x    c2tx
>  [1,] 0.01627 0.01720
>  [2,] 0.01672 0.01690
>  [3,] 0.01662 0.01669
>  [4,] 0.01733 0.01656
>  [5,] 0.01679 0.01777
>  [6,] 0.01715 0.01769
>  [7,] 0.01765 0.01769
>  [8,] 0.01703 0.01740
>  [9,] 0.01704 0.01708
> [10,] 0.01669 0.01655
> 
> sd(c2x)
> [1] 0.0003946715
> sd(c2tx)
> [1] 0.0004737099
> 
> Ted.
> 
> 
> --------------------------------------------------------------------
> E-Mail: (Ted Harding) 
> Fax-to-email: +44 (0)870 167 1972
> Date: 15-Jul-03 Time: 21:00:04
> ------------------------------ XFMail ------------------------------
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://www.stat.math.ethz.ch/mailman/listinfo/r-help
>



