[R] Wilcoxon Test and Mean Ratios

Thu Sep 20 21:07:38 CEST 2012

On Fri, Sep 21, 2012 at 6:43 AM, avinash barnwal
<avinashbarnwal123 at gmail.com> wrote:
> Hi,
>
> http://en.wikipedia.org/wiki/Wilcoxon_signed-rank_test
>
> We can clearly see that null hypothesis is median different or not.
> One way of proving non difference is P(X>Y) = P(X<Y) where X and Y are
> ordered.

Avinash.  No.

Firstly, the Wikipedia link is for the Wilcoxon signed-rank test,
which is a different test and so is irrelevant. Even if the
signed-rank test were the one being discussed, you would still be
incorrect. The signed-rank test is on the median of differences, not
the difference in medians.  These are not the same, and need not even
be in the same direction.
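
A minimal sketch of that last point, in Python with only the standard library. The paired values are made up for illustration and are not the original poster's data; the variable names are my own.

```python
import statistics

# Hypothetical paired measurements (made-up numbers for illustration only).
x = [0, 5, 10]
y = [1, 6, 4]

# Median of the paired differences x_i - y_i; the differences are -1, -1, 6.
differences = [a - b for a, b in zip(x, y)]
median_of_differences = statistics.median(differences)

# Difference of the two marginal medians: median(x) = 5, median(y) = 4.
difference_in_medians = statistics.median(x) - statistics.median(y)

print(median_of_differences)   # -1
print(difference_in_medians)   # 1
```

The two summaries have opposite signs here, so a test on one says nothing about the direction of the other.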

Secondly, it is easy to establish that the Wilcoxon rank-sum test need
not agree with the ordering in medians, just by looking at examples,
as Peter showed.
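
One such example can be sketched as follows. This is not Peter's example; the two samples are made up, and the code simply counts the pairs that the Mann-Whitney U statistic is built from.

```python
import statistics

# Hypothetical two-sample data (made-up numbers; no ties between groups).
x = [0, 0, 10, 10, 10]
y = [9, 9, 9, 11, 11]

# Mann-Whitney U for x: number of (x_i, y_j) pairs with x_i > y_j.
u_x = sum(a > b for a in x for b in y)
n_pairs = len(x) * len(y)
prob_x_greater = u_x / n_pairs

print(statistics.median(x), statistics.median(y))  # 10 9
print(u_x, n_pairs)                                # 9 25
print(prob_x_greater)                              # 0.36
```

Here x has the larger median, yet a randomly chosen x exceeds a randomly chosen y only 36% of the time, so the rank-sum comparison points the other way.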

Thirdly, there is a well-known demonstration originally due to Brad
Efron, "Efron's non-transitive dice", which implies that the
Mann-Whitney U test (which *is* equivalent to the Wilcoxon rank-sum
test) need not agree with the ordering given by *any* one-sample
summary statistic.
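
The dice argument can be checked by brute force. The sketch below uses the commonly quoted set of four dice (the face values are the usual textbook ones, not something taken from this thread) and computes each pairwise probability exactly.

```python
from fractions import Fraction
from itertools import product

# The commonly quoted set of Efron's dice (six faces each).
dice = {
    "A": [4, 4, 4, 4, 0, 0],
    "B": [3, 3, 3, 3, 3, 3],
    "C": [6, 6, 2, 2, 2, 2],
    "D": [5, 5, 5, 1, 1, 1],
}

def prob_beats(first, second):
    """Exact probability that a roll of `first` exceeds a roll of `second`."""
    wins = sum(a > b for a, b in product(first, second))
    return Fraction(wins, len(first) * len(second))

# Each die beats the next one in the cycle A -> B -> C -> D -> A.
cycle = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "A")]
results = {pair: prob_beats(dice[pair[0]], dice[pair[1]]) for pair in cycle}

for (first, second), p in results.items():
    print(f"P({first} > {second}) = {p}")
```

Every comparison in the cycle comes out at 2/3. Any one-sample summary would put the four dice in a linear order, which a cycle like this cannot respect.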

In this case, assuming the sample sizes are not too small (which looks
plausible given the p-value), the question is what summary the
original poster wants to compare: the mean (in which case the t-test
is the only option) or some other summary.  It's not possible to work
this out from the distribution of the data, so we need to ask the
original poster.  With reasonably large sample sizes he can get a
permutation test and bootstrap confidence interval for any summary
statistic of interest, but for the mean these will just reduce to the
t-test.
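
A rough sketch of what such a permutation test and bootstrap interval could look like, in Python with only the standard library. The function names, the simulated data, the number of resamples, and the choice of a percentile interval are all my assumptions for illustration. Note that the permutation null is that the group labels are exchangeable.

```python
import random
import statistics

def permutation_p_value(x, y, stat=statistics.median, n_perm=2000, seed=0):
    """Two-sided permutation p-value for stat(x) - stat(y)."""
    rng = random.Random(seed)
    observed = stat(x) - stat(y)
    pooled = list(x) + list(y)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign group labels at random
        diff = stat(pooled[:len(x)]) - stat(pooled[len(x):])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add one to numerator and denominator so the p-value is never zero.
    return (extreme + 1) / (n_perm + 1)

def bootstrap_ci(x, y, stat=statistics.median, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap interval for stat(x) - stat(y)."""
    rng = random.Random(seed)
    diffs = sorted(
        stat(rng.choices(x, k=len(x))) - stat(rng.choices(y, k=len(y)))
        for _ in range(n_boot)
    )
    alpha = (1 - level) / 2
    return diffs[int(alpha * n_boot)], diffs[int((1 - alpha) * n_boot) - 1]

# Simulated example data (hypothetical, not the original poster's).
data_rng = random.Random(42)
x = [data_rng.gauss(0.0, 1.0) for _ in range(40)]
y = [data_rng.gauss(1.5, 1.0) for _ in range(40)]

p_value = permutation_p_value(x, y)
ci_low, ci_high = bootstrap_ci(x, y)
print(p_value, ci_low, ci_high)
```

Passing a different `stat` (for example `statistics.mean`) changes the summary being compared without changing the rest of the procedure.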

Rank tests (apart from Mood's test for quantiles, which has different
problems) can really behave very strangely in the absence of
stochastic ordering, because without stochastic ordering there is no
non-parametric way to define the direction of difference between two
samples.  It's important to remember that all the beautiful theory for
rank tests was developed under the (much stronger) location-shift
model: the distribution can have any shape, but the shape is assumed
to be identical in the two groups.  Or, as one of my colleagues puts
it, "you don't know whether the treatment raises or lowers the outcome,
but you know it doesn't change anything else".
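
To illustrate the difference between a location shift and the absence of stochastic ordering, here is a small sketch that compares empirical CDFs. The helper names and the three samples are made up for illustration; this is a descriptive check on the samples, not a formal test.

```python
def ecdf(sample, t):
    """Empirical CDF of `sample` evaluated at t."""
    return sum(v <= t for v in sample) / len(sample)

def stochastically_ordered(x, y):
    """True if one empirical CDF lies on or below the other everywhere."""
    grid = sorted(set(x) | set(y))
    diffs = [ecdf(x, t) - ecdf(y, t) for t in grid]
    return all(d >= 0 for d in diffs) or all(d <= 0 for d in diffs)

# Hypothetical samples (made-up numbers).
x = [1, 2, 3, 4, 5]
shifted = [v + 2 for v in x]   # pure location shift: same shape, moved up
spread = [-3, 0, 3, 6, 9]      # same centre as x, different spread

print(stochastically_ordered(x, shifted))  # True
print(stochastically_ordered(x, spread))   # False
```

In the second comparison the empirical CDFs cross, so there is no unambiguous sense in which one sample is "larger" than the other.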

Knowledgeable and sensible statisticians who like the Wilcoxon test
(Frank Harrell comes to mind) like it because they believe stochastic
ordering is a reasonable assumption in the problems they work in, not
because they think you can do non-parametric testing in its absence.

   -thomas

-- 
Thomas Lumley
Professor of Biostatistics
University of Auckland