# [R] OT: A test with dependent samples.

Rolf Turner r.turner at auckland.ac.nz
Wed Feb 11 19:51:22 CET 2009

```
Thank you ***VERY*** much.  That clarifies a lot of things which were
vaguely knocking around in my mind but about which I could not get my
thoughts properly organized.

Thanks again.

cheers,

Rolf Turner

On 11/02/2009, at 11:27 PM,
Meyners,Michael,LAUSANNE,AppliedMathematics wrote:

> Rolf,
>
> as you explicitly asked for a comment on your proposal: It is
> generally equivalent to McNemar's test and maybe even more appropriate
> because of the asymptotics involved in the chi-squared distribution,
> which might not be too good with n=12. In some more detail:
>
> McNemar's test basically considers the difference between the
> off-diagonal elements, normalized by their sum. You do the same,
> ignoring all (0,0) and (1,1) pairs (the latter do not occur in your
> example anyway). binom.test(12,12,alternative="greater") gives you the
> one-sided p-value.
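>
> A quick sketch of that call (numbers as above: 12 discordant pairs,
> all of them (0,1)); the exact one-sided p-value is just (1/2)^12:
>
>> binom.test(12, 12, p = 0.5, alternative = "greater")$p.value
>  0.0002441406
>
> which matches the hand computation (1/2)^12 = 0.000244140625.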
>
> If, however, you do not use the exact binomial distribution (with
> n=12) but the normal approximation, you find E(X)=6 and VAR(X)=3, with
> X the number of (0,1) pairs, and you observed x=12. This gives you a
> z-score of (12-6)/sqrt(3) and a corresponding one-sided p-value:
>
>> pnorm(6/sqrt(3), lower.tail=F)
>  0.0002660028
>
> Further, McNemar's test WITHOUT continuity correction gives
>
>> mcnemar.test(matrix(c(61,0,12,0),2,2), correct=F)
>
>         McNemar's Chi-squared test
>
> data:  matrix(c(61, 0, 12, 0), 2, 2)
> McNemar's chi-squared = 12, df = 1, p-value = 0.000532
>
> It comes as no surprise that the reported (two-sided!) p-value is
> exactly twice the (one-sided!) p-value from the binomial test with
> normal approximation:
>
>> pnorm(6/sqrt(3), lower.tail=F)*2
>  0.0005320055
>
> I do not want to stress the pros and cons of continuity corrections,
> but if neglected, you see that you get the same results (except that
> McNemar is generally two-sided), due to the relation of the normal and
> chi-squared distributions.
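>
> To see that relation numerically, with z = 6/sqrt(3): the McNemar
> statistic 12 is exactly z^2, and the chi-squared upper tail equals the
> doubled one-sided normal tail; a quick sketch:
>
>> (6/sqrt(3))^2     # the chi-squared statistic is the squared z-score
>  12
>> all.equal(pchisq(12, df = 1, lower.tail = FALSE),
> +           2 * pnorm(6/sqrt(3), lower.tail = FALSE))
>  TRUE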
>
> If you use the binomial test, you can forget about asymptotics and
> continuity correction; that's why I indeed consider your approach
> superior to the chi-squared approximation used in McNemar's test. But
> you might fail to find a reference for the exact approach...
>
>
> You briefly asked in a later mail about testing for p=0. Indeed, _any_
> incident will disprove this hypothesis, and the p-value reported by
>
>> binom.test(1,73,p=0)
>
>         Exact binomial test
>
> data:  1 and 73
> number of successes = 1, number of trials = 73, p-value < 2.2e-16
> alternative hypothesis: true probability of success is not equal to 0
> 95 percent confidence interval:
>  0.0003467592 0.0739763232
> sample estimates:
> probability of success
>             0.01369863
>
> is not wrong (as R just tells us it is BELOW a certain value), but
> could be refined by saying "p-value = 0". BTW, I am not sure what
> "p-value = TRUE" tells me, as derived from binom.test(0,73,p=0). I
> personally don't care about either, as to test H0: p=0, I would not
> use any software but rather stress some basic statistical theory.
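>
> The basic theory in question is simply that under H0: p=0 a success is
> impossible, so any observed success gives a p-value of exactly 0; a
> quick sketch:
>
>> dbinom(0, 73, prob = 0)    # under p=0, zero successes are certain
>  1
>> pbinom(0, 73, prob = 0, lower.tail = FALSE)  # P(at least one success)
>  0
>
> so observing even a single success has probability 0 under the null.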
>
>
> There remains the question whether McNemar's test (or your proposal)
> is generally appropriate here. Like others, I have doubts, except
> maybe if observations were made on two subsequent days, on the second
> of which the drug was administered. How could you otherwise be sure
> that the increase in incidences is not due to the progression of
> cancer, say, which would be difficult to rule out? Also, from my
> experience and in contrast to your point of view, I'd rather assume it
> likely that a vet is much more reluctant to apply the new treatment to
> an already vomiting cat; your "baseline" with 0 out of 73 at least
> seems suspicious, and until proven wrong, I'd take the assumption that
> this is not by chance only.
> I understand that this is an observational study with no
> control/randomization, but concluding a side-effect from a
> statistically significant change in incidence rates from this study
> seems questionable to me. What I'd propose (and have used in
> comparable situations) is to confine yourself to the confidence
> interval and let the investigator decide / discuss whether its lower
> bound is higher than what she would expect under control or
> alternative treatment. She needs to have some experience, if not
> historical data, and using a test or just this confidence interval,
> you will only get a hint at a possible side-effect, no formal proof
> (which is always difficult from observational studies). Discussing the
> CI would seem fairer and even stronger to me then. In other words:
> mathematically, you'll get an appropriate test, while conceptually,
> I'm far from being convinced.
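>
> For the confidence-interval route, a sketch with the numbers from this
> thread (12 of 73 cats): the exact (Clopper-Pearson) 95% interval for
> the post-treatment incidence rate comes straight from binom.test():
>
>> binom.test(12, 73)$conf.int   # exact 95% CI for the incidence rate
>
> and its lower bound is what the investigator would compare against her
> expectation under control or alternative treatment.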
>
> Hope this makes sense,
> Michael
>
>
>
>
> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
> On Behalf Of Rolf Turner
> Sent: Tuesday, 10 February 2009 22:33
> To: R-help Forum
> Subject: [R] OT: A test with dependent samples.
>
>
> I am appealing to the general collective wisdom of this list in
> respect of a statistics (rather than R) question.  This question comes
> to me from a friend who is a veterinary oncologist.  In a study that
> she is writing up there were 73 cats who were treated with a drug
> called piroxicam.  None of the cats were observed to be subject to
> vomiting prior to treatment; 12 of the cats were subject to vomiting
> after treatment commenced.  She wants to be able to say that the drug
> has a ``significant'' impact with respect to this unwanted
> side-effect.
>
> Initially she did a chi-squared test.  (Presumably on the matrix
> matrix(c(73,0,61,12),2,2) --- she didn't give details and I didn't
> pursue this.) I pointed out to her that because of the dependence ---
> same 73 cats pre- and post-treatment --- the chi-squared test is
> inappropriate.
>
> So what *is* appropriate?  There is a dependence structure of some
> sort, but it seems to me to be impossible to estimate.
>
> After mulling it over for a long while (I'm slow!) I decided that a
> non-parametric approach, along the following lines, makes sense:
>
> We have 73 independent pairs of outcomes (a,b) where a or b is 0 if
> the cat didn't barf, and is 1 if it did barf.
>
> We actually observe 61 (0,0) pairs and 12 (0,1) pairs.
>
> If there is no effect from the piroxicam, then (0,1) and (1,0) are
> equally likely.  So given that the outcome is in {(0,1),(1,0)} the
> probability of each is 1/2.
>
> Thus we have a sequence of 12 (0,1)-s where (under the null
> hypothesis) the probability of each entry is 1/2.  Hence the
> probability of this sequence is (1/2)^12 = 0.00024.  So the p-value of
> the (one-sided) test is 0.00024.  Hence the result is ``significant''
> at the usual levels, and my vet friend is happy.
>
> I would very much appreciate comments on my reasoning.  Have I made
> any goof-ups, missed any obvious pit-falls?  Gone down a wrong garden
> path?
>
> Is there a better approach?
>
> Most importantly (!!!): Is there any literature in which this approach
> is spelled out?  (The journal in which she wishes to publish will
> almost surely demand a citation.  They *won't* want to see the
> reasoning spelled out in the paper.)
>
> I would conjecture that this sort of scenario must arise reasonably
> often in medical statistics and the suggested approach (if it is
> indeed valid and sensible) would be ``standard''.  It might even have
> a name!  But I have no idea where to start looking, so I thought I'd
> ask this wonderfully learned list.
>
> Thanks for any input.
>
> 	cheers,
>
> 		Rolf Turner
>
>
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
```