[Rd] bug(?) in chisq.test
Spencer Graves
spencer.graves at pdf.com
Tue Mar 9 16:09:11 MET 2004
This is a message for whoever maintains "chisq.test": For an
outcome more extreme than 2000 simulations, a Monte Carlo p-value of "<
2.2e-16" was printed. Ripley said the proper p-value for such cases
should be 1/(B+1) = 1/2001. This can be easily fixed by adding
"if(PVAL==0)PVAL <- 1/(B+1)" right after the following line in the code
for chisq.test (in R 1.8.1 for Windows):
> PVAL <- sum(tmp$results >= STATISTIC)/B
Thanks for all your hard work for the R Project.
Best Wishes,
spencer graves
-------- Original Message --------
Subject: Re: [R] Monte Carlo p-value (was "question")
Date: Tue, 9 Mar 2004 08:17:02 +0000 (GMT)
From: Prof Brian Ripley <ripley at stats.ox.ac.uk>
To: Spencer Graves <spencer.graves at pdf.com>
CC: cfinet at ens-lyon.fr, r-help at stat.math.ethz.ch, kjetil at entelnet.bo
1/(B+1) is the significance level of the Monte Carlo test if the data
give the most extreme value. (E.g. Ripley, 1987, p. 171.) This is a
calculation, not a convention, and assumes a continuously distributed
statistic.
On Mon, 8 Mar 2004, Spencer Graves wrote:
> What is the standard convention for a Monte Carlo p-value when the
> observed outcome is more extreme than all simulations? The example
> provided by Cédric Fine produced a Monte Carlo p-value, according to
> Kjetil Halvorsen, of "2.2e-16 (based on 2000 replicates)". This seems
> inappropriate to me.
>
> By reading the code, I found that for this case,
>
> PVAL <- sum(tmp$results >= STATISTIC)/B
>
> where STATISTIC is the observed chi-square while "tmp$results" is a
> vector of length B of chi-squares from Monte Carlo simulated tables with
> the same marginals. Thus, PVAL ranges over seq(0, 1, length=(B+1)).
> For the observed table, presumably, PVAL = 0. The function "chisq.test"
> apparently returns an object of class "htest", and "print.htest" calls
> "format.pval(x$p.value, digits = digits)", for which "format.pval(0, 4)"
> is "< 2.2e-16".
>
> Can someone provide an appropriate reference or sense of the
> literature on the appropriate number to report for a Monte Carlo p-value
> when the observed is more extreme than all the simulations? A value of
> 0 or "< 2.2e-16" violates my sense of the logic of this situation. If
> the appropriate number is 0.5/B, then the line "PVAL <- sum(tmp$results
> >= STATISTIC)/B" could be followed immediately by something like the
> following:
>
> if(PVAL==0) PVAL <- 0.5/B
>
> Comments?
> Best Wishes,
> Spencer Graves
>
> kjetil at entelnet.bo wrote:
>
> >On 8 Mar 2004 at 16:38, cfinet at ens-lyon.fr wrote:
> >
> >
> >
> >>I do not manage to make a Fisher´s exact test with the next matrix :
> >>
> >>
> >>
> >
> >You can consider using chisq.test() with the argument sim=TRUE:
> >
> >
> >
> >>mat <- matrix(scan(), 7, 12, byrow=TRUE)
> >>
> >>
> >1: 1 3 0 1 2 9 0 0 2 5 8 6
> >13: 0 3 3 0 0 0 0 5 0 3 0 0
> >25: 0 0 0 0 0 10 0 0 0 0 10 0
> >37: 0 2 0 2 6 14 0 0 6 0 10 6
> >49: 0 5 0 0 0 7 0 0 0 2 8 0
> >61: 0 0 1 9 4 7 2 1 4 2 12 5
> >73: 0 6 0 0 0 0 0 0 0 5 1 3
> >85:
> >Read 84 items
> >
> >
> >
> >>chisq.test(mat, sim=TRUE)
> >>
> >>
> >
> > Pearson's Chi-squared test with simulated p-value (based
> > on 2000 replicates)
> >
> >data: mat
> >X-squared = 218.3366, df = NA, p-value = < 2.2e-16
> >
> >Kjetil Halvorsen
> >
> >
> >
> >
> >
> >> [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12]
> >>[1,] 1 3 0 1 2 9 0 0 2 5 8 6
> >>[2,] 0 3 3 0 0 0 0 5 0 3 0 0
> >>[3,] 0 0 0 0 0 10 0 0 0 0 10 0
> >>[4,] 0 2 0 2 6 14 0 0 6 0 10 6
> >>[5,] 0 5 0 0 0 7 0 0 0 2 8 0
> >>[6,] 0 0 1 9 4 7 2 1 4 2 12 5
> >>[7,] 0 6 0 0 0 0 0 0 0 5 1 3
> >>
> >>but I do not understand why it does not work since I obtain the next
> >>error message :
> >>
> >>
> >>
> >>
> >>
> >>>fisher.test(enfin.matrix)
> >>>
> >>>
> >>Error in fisher.test(enfin.matrix) : Bug in FEXACT: gave negative key
> >>
> >>thank you for considering my application
> >>
> >>cédric finet
> >>
> >>______________________________________________
> >>R-help at stat.math.ethz.ch mailing list
> >>https://www.stat.math.ethz.ch/mailman/listinfo/r-help
> >>PLEASE do read the posting guide!
> >>http://www.R-project.org/posting-guide.html
> >>
> >>
> >
> >______________________________________________
> >R-help at stat.math.ethz.ch mailing list
> >https://www.stat.math.ethz.ch/mailman/listinfo/r-help
> >PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
> >
> >
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://www.stat.math.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
>
>
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
______________________________________________
R-help at stat.math.ethz.ch mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
More information about the R-devel
mailing list