[R] Chi-squared test

Marc Schwartz MSchwartz at mn.rr.com
Fri Nov 25 02:03:17 CET 2005


On Thu, 2005-11-24 at 21:55 +0000, Ted Harding wrote:
> On 24-Nov-05 P Ehlers wrote:
> > Bianca Vieru-Dimulescu wrote:
> >> Hello,
> >> I'm trying to calculate a chi-squared test to see if my data are 
> >> different from the theoretical distribution or not:
> >> 
> >> chisq.test(rbind(c(79,52,69,71,82,87,95,74,55,78,49,60),
> >>                  c(80,80,80,80,80,80,80,80,80,80,80,80)))
> >> 
> >>       Pearson's Chi-squared test
> >> 
> >> data:  rbind(c(79, 52, 69, 71, 82, 87, 95, 74, 55, 78, 49, 60),
> >>              c(80, 80, 80, 80, 80, 80, 80, 80, 80, 80, 80, 80))
> >> X-squared = 17.6, df = 11, p-value = 0.09142
> >> 
> >> Is this correct? When I do the same thing in Excel, I obtain
> >> a different p-value (1.65778E-14).
> >> 
> >> Thanks a lot,
> >> Bianca
> > 
> > It would be unusual to have 12 observed frequencies all equal to 80.
> > So I'm guessing that you have a 12-category variable and want to
> > test its fit to a discrete uniform distribution. I assume that your
> > frequencies are
> > 
> > x <- c(79, 52, 69, 71, 82, 87, 95, 74, 55, 78, 49, 60)
> > 
> > Then just use
> > 
> > chisq.test(x)
> > 
> > (see the help page).
> > 
> > (If those 80's are expected cell frequencies, they should sum to
> > sum(x) = 851.)
> > 
> > I don't know what Excel does.
> > 
> > Peter
> > 
> > Peter Ehlers
> > University of Calgary
> 
> I'm rather with Peter on this question! I've tried to infer what
> you're really trying to do.
> 
> My a-priori plausible hypothesis was that you have
> 
>   k<-12
> 
> independent observations which have equal expected values
> 
>   m<-rep(80,k)
> 
> and are observed as
> 
>   x<-c(79,52,69,71,82,87,95,74,55,78,49,60)
> 
> On this basis, the chi-squared statistic Sum((O-E)^2/E) is
> 
>   C2<-sum(((x-m)^2)/m)
> 
> so C2 = 41.1375, and on this hypothesis the chi-squared would
> have k=12 degrees of freedom. Then:
> 
>   1-pchisq(C2,k)
> ## [1] 4.647553e-05
>
> which is nowhere near the 1.65778E-14 you report from Excel.
> Also, the result from Peter's chisq.test(x) is p = 0.0006468,
> even further away.

It's late on Turkey Day here, but shouldn't that be:

  1 - pchisq(C2, k - 1)  # 11 df
  ## [1] 2.282202e-05

which is what I get using OO.org's Calc 2.0 with the CHITEST function
using the two vectors as the observed (x) and expected (m) values. I
also get this result from Gnumeric 1.4.3 using the same CHITEST
function.

Using the CHIDIST function in OO.org's Calc:

=CHIDIST(41.1375;11)

I also get the same p value.
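
For what it's worth, here is a minimal R sketch of what I understand the
spreadsheet functions to be computing for a single row of 12 counts. That
CHITEST uses n - 1 degrees of freedom for a single row, and that CHIDIST
returns the upper-tail probability, is my reading of their documentation,
so treat both as assumptions. The names C2 and p11 are only for
illustration.

```r
# Observed counts and the fixed expected value of 80 per cell, as in the thread.
x <- c(79, 52, 69, 71, 82, 87, 95, 74, 55, 78, 49, 60)
m <- rep(80, length(x))

# Pearson statistic: sum of (O - E)^2 / E over the 12 cells.
C2 <- sum((x - m)^2 / m)

# Upper-tail probability with length(x) - 1 = 11 degrees of freedom
# (assumed to match CHITEST / CHIDIST(C2; 11)).
p11 <- pchisq(C2, df = length(x) - 1, lower.tail = FALSE)

print(C2)
print(p11)
```

Using lower.tail = FALSE rather than 1 - pchisq() avoids losing precision
when the p-value is very small.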

Lastly, I get the same results on my wife's computer using Excel 2002
and the CHITEST function, so Bianca may want to check for typos in the
Excel sheet, or for incorrect syntax in the CHITEST formula (e.g., the
wrong cell range).

Lacking that, I too am confuzzled as to where the 1.65778E-14 comes
from.
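
For comparison, here is a sketch of the chisq.test() calls discussed in
this thread, side by side. The first treats the two rows as a 2 x 12
contingency table, which is what the original call did. The second is
the goodness-of-fit test against equal cell probabilities. The third,
using rescale.p = TRUE, is my own addition: it passes the 80's as
relative weights and should agree with the second, since rescaled equal
weights are just 1/12 each. The object names are hypothetical.

```r
x <- c(79, 52, 69, 71, 82, 87, 95, 74, 55, 78, 49, 60)

# (a) Original call: a 2 x 12 table, i.e. a test of homogeneity of two rows.
tab <- chisq.test(rbind(x, rep(80, length(x))))

# (b) Goodness of fit to a discrete uniform distribution (default p = 1/12).
gof <- chisq.test(x)

# (c) Same test, with the 80's rescaled to probabilities summing to 1.
gof2 <- chisq.test(x, p = rep(80, length(x)), rescale.p = TRUE)

# Collect the p-values for comparison.
pvals <- c(table = tab$p.value, gof = gof$p.value, gof_rescaled = gof2$p.value)
print(pvals)
```

The expected counts actually used in (b) and (c) are available as
gof$expected; they sum to sum(x) = 851 rather than 12 * 80 = 960.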

HTH,

Marc Schwartz

<snip>



