[R] Bootstrap or Wilcoxons' test?

Sat Feb 14 18:47:02 CET 2009

On Feb 14, 2009, at 3:23 AM, Thomas Lumley wrote:

> On Fri, 13 Feb 2009, David Winsemius wrote:
>
>> I must disagree with both this general characterization of the  
>> Wilcoxon test and with the specific example offered. First, we  
>> ought to spell the author's name correctly and then clarify that it is
>> the Wilcoxon rank-sum test that is being considered. Next, the WRS  
>> test is a test for differences in the location parameter of  
>> independent samples conditional on the samples having been drawn  
>> from the same distribution. The WRS test would have no
>> discriminatory power for samples drawn from the same family of
>> distributions with equal location parameters that differ only in
>> dispersion. Look at the formula, for Pete's sake. It
>> summarizes differences in ranking, so it is in fact designed NOT to  
>> be sensitive to the spread of the values in the sample. It would  
>> have no power, for instance, to test the variances of two samples,  
>> both with a mean of 0, and one having a variance of 1 with the  
>> other having a variance of 3.  One can think of the WRS as a test  
>> for unequal medians.
>>
>
> One can, and it may be helpful to do so, as long as one knows it  
> isn't actually true. Unfortunately, some text books claim or  
> strongly imply it is true.

Yes. I have been corrected on that point before, which was why I chose
the words I did. Doing a Google search on "derivation wilcoxon rank-sum
test", the first hit is a text, "Introductory Biostatistics" by Le,
that is an example of such a textbook, as are many others further down
the hit list.
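
A rough simulation may make the distinction concrete. The sketch below
is only an illustration under assumptions of my own choosing (the
helper reject_rate, the sample size of 200, the 300 replications and
the particular distributions are not from the thread). It compares two
skewed populations that share a median of 0 with two symmetric
populations that share a mean of 0 but have variances 1 and 3.

```r
set.seed(1)

# Proportion of simulated two-sample data sets in which wilcox.test
# rejects at the given level (hypothetical helper, for illustration only).
reject_rate <- function(draw_x, draw_y, n = 200, reps = 300, alpha = 0.05) {
  p_values <- replicate(reps, wilcox.test(draw_x(n), draw_y(n))$p.value)
  mean(p_values < alpha)
}

# Right-skewed and left-skewed samples, both with population median 0.
right_skew <- function(n) rexp(n) - log(2)
left_skew  <- function(n) log(2) - rexp(n)

# Symmetric samples with mean 0 and variances 1 and 3.
narrow <- function(n) rnorm(n, sd = 1)
wide   <- function(n) rnorm(n, sd = sqrt(3))

rate_equal_medians <- reject_rate(right_skew, left_skew)
rate_spread_only   <- reject_rate(narrow, wide)
print(c(equal_medians = rate_equal_medians, spread_only = rate_spread_only))
```

If the reasoning in this thread holds, the first rate should be well
above 5% even though the medians are equal, while the second should
stay fairly close to the nominal level.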

> To make the test consistent for differences in the median you have  
> to know in advance that the distributions differ only by a location  
> shift, and then it is also consistent for differences in mean (or in  
> any other location parameter).

That is a typical assumption in the derivation of sampling  
distributions of the WRS W-statistic, is it not?
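
As a small check on how the null distribution arises: when the two
samples come from the same continuous distribution, every assignment of
ranks to the first sample is equally likely, so the distribution of W
can be enumerated directly. The sketch below uses sample sizes m = 3
and n = 4, chosen only for illustration, and compares the enumeration
with R's dwilcox, which is parameterised by U = W - m(m+1)/2.

```r
m <- 3
n <- 4

# Every possible set of ranks the first sample could receive.
rank_sets <- combn(m + n, m)

# Rank sum W for each set, shifted to the Mann-Whitney U scale.
u_values <- colSums(rank_sets) - m * (m + 1) / 2

# Relative frequency of each U value under equally likely rank sets.
perm_probs <- as.numeric(table(factor(u_values, levels = 0:(m * n)))) /
  ncol(rank_sets)

# Distribution of the same statistic as tabulated by R.
exact_probs <- dwilcox(0:(m * n), m, n)

print(all.equal(perm_probs, exact_probs))
```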

Troendle's article in Statistics in Medicine 18, 2763-2773 (1999)
(only available to subscribers and libraries):
http://www3.interscience.wiley.com.online.uchc.edu/journal/66002289/abstract

An interesting discussion by O'Brien and Castelloe, accessible online:
http://www.amstat.org/sections/SRMS/Proceedings/y2005/Files/JSM2005-000930.pdf

Googling also brought up a University of Minnesota website that has R
scripts illustrating permutation tests (including the WRS) from
Hollander and Wolfe, and a page for the WRS:

http://www.stat.umn.edu/geyer/old/5601/examp/perm.html

http://www.stat.umn.edu/geyer/5601/examp/ranksum.html#test

> Also, the operating characteristics aren't particularly similar to a  
> real test for medians, which has pretty low efficiency at the Normal  
> location-shift model (2/pi, IIRC) and is much more sensitive to ties  
> in the data.

My memory from Conover and Iman (only having seen the first edition)
was that the Pitman efficiency of the WRS relative to the t-test was
around 85%. That figure is closer to the lower bound over all
distributions (0.864); in the Gaussian case of unequal means it is
3/pi, about 95%. I suppose the choice of a central measure for
reporting ought to be based on the purposes of the investigation. If
one is planning classification, and the distributions are skewed, then
the median might be preferable because it is less subject to sampling
variability:

 > var( apply( sapply(1:500, function(x) rlnorm(20)), 2, median))
[1] 0.08123678
 >
 >
 > var( apply( sapply(1:500, function(x) rlnorm(20)), 2, mean))
[1] 0.2168887
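
On the efficiency question, a rough power comparison at the Normal
location-shift model can be simulated. This is only a sketch: the
sample size of 30 per group, the shift of 0.7 and the 1000 replications
are arbitrary choices of mine. The last line prints the standard
asymptotic relative efficiencies against the t-test, 3/pi for the WRS
and 2/pi for the median test.

```r
set.seed(2)
n     <- 30
shift <- 0.7
reps  <- 1000

# For each simulated pair of Normal samples, record both p-values.
p_values <- replicate(reps, {
  x <- rnorm(n)
  y <- rnorm(n, mean = shift)
  c(t = t.test(x, y)$p.value, wilcoxon = wilcox.test(x, y)$p.value)
})

# Estimated power of each test at the 5% level.
power <- rowMeans(p_values < 0.05)
print(power)

# Asymptotic relative efficiencies at the Normal model, relative to t.
print(c(wilcoxon_vs_t = 3 / pi, median_vs_t = 2 / pi))
```

The two estimated powers should come out close to each other, with the
t-test slightly ahead.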

Thank you for the clarification.

-- 
David Winsemius

>
>
> And I could go on and on about non-transitivity, but I won't. Anyone  
> who is interested can Google for 'Efron dice'.
>
>       -thomas
>
>
> Thomas Lumley			Assoc. Professor, Biostatistics
> tlumley at u.washington.edu	University of Washington, Seattle
>
>
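
For anyone who does not want to Google, the usual set of Efron dice can
be checked in a few lines. The face values below are the commonly
quoted set, not something given in this thread, and p_beats is a
hypothetical helper that computes P(X > Y) over all pairs of faces,
which is the same quantity the rank-sum statistic estimates.

```r
# Commonly quoted Efron dice (assumed face values, not from the thread).
dice <- list(
  A = c(4, 4, 4, 4, 0, 0),
  B = c(3, 3, 3, 3, 3, 3),
  C = c(6, 6, 2, 2, 2, 2),
  D = c(5, 5, 5, 1, 1, 1)
)

# Probability that a roll of x is strictly greater than a roll of y.
p_beats <- function(x, y) mean(outer(x, y, ">"))

cycle <- c(
  A_beats_B = p_beats(dice$A, dice$B),
  B_beats_C = p_beats(dice$B, dice$C),
  C_beats_D = p_beats(dice$C, dice$D),
  D_beats_A = p_beats(dice$D, dice$A)
)
print(cycle)
```

Each die beats the next one in the cycle with probability 2/3, so
"tends to be larger than" is not a transitive relation.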