[R] chisq.test, basic question

Huntsinger, Reid reid_huntsinger at merck.com
Tue Jul 30 23:15:11 CEST 2002


My previous reply (below) uses "false positive" in a particularly misleading
way. I intended this to mean "incorrect rejection of the null hypothesis of
no association". I succumbed to the temptation to call a "rejection of the
null hypothesis of no association" a "positive" (cancelling a double
negative?), but as it is a rejection (of no matter what) I should have
called it a "negative". 

Reid Huntsinger

-----Original Message-----
From: Huntsinger, Reid [mailto:reid_huntsinger at merck.com]
Sent: Tuesday, July 30, 2002 12:07 PM
To: 'juli g. pausas'; r-help
Subject: RE: [R] chisq.test, basic question


The cells are interpreted as counts, so by scaling you're analyzing a
different experiment (one with fewer observations). So the chi-squared value
will change: for fixed proportions, the terms (O-E)^2/E in the statistic
scale linearly with the total count, ignoring rounding and Yates' continuity
correction.
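
As a rough illustration of that scaling, here is a minimal sketch using the
counts from the original question. The helper pearson_stat() is a
hypothetical name introduced only for this example; it computes the
statistic without the continuity correction and without rounding the
percentages.

```r
# 2 x 2 counts from the original question (n = 210)
m <- matrix(c(15, 28, 32, 135), 2, 2)

# Hypothetical helper: Pearson statistic sum((O - E)^2 / E),
# with no continuity correction
pearson_stat <- function(tab) {
  expected <- outer(rowSums(tab), colSums(tab)) / sum(tab)
  sum((tab - expected)^2 / expected)
}

x2_counts  <- pearson_stat(m)                 # original counts
x2_percent <- pearson_stat(100 * m / sum(m))  # same table as percentages, n = 100

# The statistic shrinks by the same factor as the total count
x2_counts / x2_percent   # 2.1
sum(m) / 100             # 2.1
```

On the counts, pearson_stat(m) should agree with
chisq.test(m, correct = FALSE)$statistic.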

The chisq.test on the original data is a test of association. Conventionally
you decide ahead of time on a threshold for "false positives", say 5%, then
use the reported p-value to determine whether to accept or reject the null
hypothesis of no association. Had you chosen 5%, since the reported p-value
is smaller than 5%, you would reject, i.e., decide that association is
present.
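
The decision rule can be sketched in a few lines. This assumes the 5%
threshold used as the example above; the variable names alpha, p_value and
reject_null are made up for illustration.

```r
# 2 x 2 counts from the original question
m <- matrix(c(15, 28, 32, 135), 2, 2)

alpha <- 0.05                      # threshold fixed before looking at the data
p_value <- chisq.test(m)$p.value   # default test, with Yates' correction

# Reject the null hypothesis of no association when p falls below alpha
reject_null <- p_value < alpha
reject_null   # TRUE, since the reported p-value is 0.04543
```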

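The chi-squared statistic reported by chisq.test is not really a measure of
association: it grows with the number of observations even when the
proportions stay the same. Your observation is a nice illustration of why.
There are many measures of association (e.g., odds ratio); see for example
Alan Agresti's "Categorical Data Analysis" for some discussion.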
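
For contrast, here is a small sketch of the sample odds ratio on the same
table. The function odds_ratio() is a hypothetical helper written for this
example, not part of base R; it computes the plain cross-product ratio.

```r
# 2 x 2 counts from the original question
m <- matrix(c(15, 28, 32, 135), 2, 2,
            dimnames = list(c("P-", "P+"), c("R-", "R+")))

# Hypothetical helper: cross-product (sample odds) ratio of a 2 x 2 table
odds_ratio <- function(tab) {
  (tab[1, 1] * tab[2, 2]) / (tab[1, 2] * tab[2, 1])
}

or_counts  <- odds_ratio(m)                 # from the counts: 2025 / 896
or_percent <- odds_ratio(100 * m / sum(m))  # from unrounded percentages

# Unlike the chi-squared statistic, the odds ratio does not change
# when every cell is multiplied by the same constant
all.equal(or_counts, or_percent)   # TRUE
```

Rounding the percentages first would change the value slightly, but only
through the rounding, not through the change of scale.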

Reid Huntsinger

-----Original Message-----
From: juli g. pausas [mailto:juli at ceam.es]
Sent: Tuesday, July 30, 2002 12:12 PM
To: r-help
Subject: [R] chisq.test, basic question


Dear R-users,
I have a question, and I'm not sure whether it comes from my
misunderstanding of basic statistics, my misunderstanding of R, or
both.
I've got the counts of a 2 x 2 contingency table, and I'd like to test
the association:

m <-  matrix(c(15,28,32,135), 2, 2)
colnames(m) <- c("R-", "R+"); rownames(m) <- c("P-", "P+")
m
#    R-  R+
# P- 15  32
# P+ 28 135

chisq.test(m)  # X-squared = 4.0027, df = 1, p-value = 0.04543

Is this the correct way to test association between P and R? (I haven't
got the original data).
My problem is that if I use percentages, then I get different results:

m2 <- 100*m/sum(m) #
chisq.test(round(m2)) # X-squared = 1.5318, df = 1, p-value = 0.2158

Shouldn't this give about the same result (apart from the rounding)? Should
the degree of association between P and R be the same?  Or am I using
chisq.test() wrongly?

Thanks in advance,

Juli


-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.
-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._.
_._


----------------------------------------------------------------------------
--
Notice:  This e-mail message, together with any attachments, contains
information of Merck & Co., Inc. (Whitehouse Station, New Jersey, USA) that
may be confidential, proprietary copyrighted and/or legally privileged, and
is intended solely for the use of the individual or entity named in this
message.  If you are not the intended recipient, and have received this
message in error, please immediately return this by e-mail and then delete
it.

============================================================================
==
