[R] A question about the Wilcoxon signed rank test

Thomas Lumley tlumley at u.washington.edu
Mon Apr 5 19:07:51 CEST 2010



The problem is that your data contains ties, which mess up the nice theory and result in different people using different approximations.

I don't know where your z-statistic formula comes from, but you can find the one R uses by looking at the source code in stats:::wilcox.test.default.
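
For orientation, here is a rough sketch of the normal approximation as I understand that source code: zero differences are dropped, n counts only the remaining differences, the variance is reduced for ties, and a continuity correction is applied. The helper name signed_rank_approx and the toy vectors are made up for illustration; check the details against the source in your own R version.

```r
# Sketch of the normal approximation for the paired signed rank test.
# signed_rank_approx is a hypothetical helper, not part of R.
signed_rank_approx <- function(x, y) {
  d <- x - y
  d <- d[d != 0]                 # zero differences are discarded
  n <- length(d)                 # n counts only the nonzero differences
  r <- rank(abs(d))              # midranks for tied |d|
  V <- sum(r[d > 0])             # sum of ranks of the positive differences
  nties <- table(r)              # sizes of the tie groups
  sigma <- sqrt(n * (n + 1) * (2 * n + 1) / 24 - sum(nties^3 - nties) / 48)
  z <- V - n * (n + 1) / 4
  z <- (z - sign(z) * 0.5) / sigma   # continuity correction
  p <- 2 * min(pnorm(z), pnorm(z, lower.tail = FALSE))
  list(V = V, n = n, z = z, p.value = p)
}

# Made-up toy data containing zeros and ties, for illustration only
x <- c(12, 0, 34, 4, 50, 22, 0, 71, 44, 60)
y <- c(10, 0, 39, 0, 45, 24, 4, 61, 44, 55)
res <- signed_rank_approx(x, y)
ref <- wilcox.test(x, y, paired = TRUE, exact = FALSE)
c(sketch = res$p.value, wilcox.test = ref$p.value)
```

Note that n here is the number of nonzero differences. In the quoted calculation the two rank sums add up to 1819 + 1751 = 3570 = 84*85/2, so 84 differences were ranked, not 100.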

To see that R's z-statistic approximation is better than yours, try breaking the ties randomly and using exact=TRUE.
     wilcox.test(endprice0 + rnorm(length(endprice0), sd=1e-10), endprice1, paired=TRUE, exact=TRUE)
You will find that the p-values agree fairly well with R's 0.88.
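
Since the tie-breaking is random, a single run is only one draw. A minimal sketch of repeating it and looking at the spread of exact p-values follows; the helper name break_ties_pvalues, the number of repetitions, and the toy vectors are assumptions made for illustration.

```r
# Sketch: break ties at random several times and collect the exact p-values.
# break_ties_pvalues is a hypothetical helper, not part of R.
break_ties_pvalues <- function(x, y, reps = 20, sd = 1e-10) {
  replicate(reps, {
    jittered <- x + rnorm(length(x), sd = sd)  # tiny noise, far below the data resolution
    wilcox.test(jittered, y, paired = TRUE, exact = TRUE)$p.value
  })
}

set.seed(1)
# Made-up toy data containing zeros and ties, for illustration only
x <- c(12, 0, 34, 4, 50, 22, 0, 71, 44, 60)
y <- c(10, 0, 39, 0, 45, 24, 4, 61, 44, 55)
p <- break_ties_pvalues(x, y)
range(p)                                                 # spread across random tie-breakings
wilcox.test(x, y, paired = TRUE, exact = FALSE)$p.value  # normal approximation, for comparison
```

With the real data, replace x and y by endprice0 and endprice1.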

         -thomas


On Mon, 5 Apr 2010, hix li wrote:

> Hi guys,
>
> I have two data sets of prices: endprice0, endprice1
>
> I use the Wilcoxon test:
>
> wilcox.test(endprice0, endprice1, paired = TRUE, alternative = "two.sided", conf.int = TRUE, conf.level = 0.9)
>
> The result is V = 1819, p-value = 0.8812.
>
> Then I calculated the z-value of the test: z-value = -2.661263. The corresponding p-value is 0.003892, which is different from the p-value computed by wilcox.test. I am using the following steps to compute the z-value:
>
> diff = c(endprice0 - endprice1)
> diffNew = diff[diff != 0]
> diffNew.rank = rank(abs(diffNew))
> diffNew.rank.sign <- diffNew.rank * sign(diffNew)
> ranks.pos <- sum(diffNew.rank.sign[diffNew.rank.sign > 0])   # = 1819
> ranks.neg <- -sum(diffNew.rank.sign[diffNew.rank.sign < 0])  # = 1751
>
> v = ranks.neg
> n = 100
> z = (v - n*(n+1)/4) / sqrt(n*(n+1)*(2*n+1)/24)               # = -2.661263
>
>
> Which p-value should I take for the Wilcox test then?
>
> Hix
>
> The data sets used in my test are:
>
> endprice0 = c(136.3800, 134.8500, 350.7500, 18.8400, 0.0000, 0.0600, 159.1900, 242.5600, 0.0400, 289.9000, 0.0000, 42.6100, 275.9500, 76.6200, 36.6400, 0.0000, 81.5900, 179.3600, 86.2200, 210.8000, 118.7200, 45.5800, 98.1900, 137.0300, 47.7900, 123.7700, 23.2400, 0.0400, 130.2300, 0.0400, 0.0000, 130.3800, 150.7600, 0.5900, 277.3000, 166.0100, 0.0400, 71.9400, 80.1300, 162.8800, 85.0500, 125.4400, 138.0600, 0.0600, 140.6300, 100.9700, 0.0000, 0.0400, 213.7300, 86.9200, 294.8200, 0.0400, 0.0000, 239.2100, 0.0000, 13.7700, 95.5300, 0.0400, 146.7200, 0.0000, 0.00, 121.57, 68.23, 5.31, 0.04, 96.31, 206.02, 313.39, 92.34, 31.64, 118.71, 499.6, 0, 129.04, 106.88, 183.92, 50.42, 0, 0.04, 0.04, 1.57, 355.56, 81.19, 327.17, 151.18, 0, 0, 125.03, 0, 0.04, 132.01, 0, 0, 11.49, 23, 13.46, 326.64, 198.19, 114.22, 79.53)
>
> endprice1 = c(138.9300, 131.9700, 300.4700, 0.0000, 0.0000, 0.2200, 159.6300, 277.9100, 0.0000, 328.9700, 0.0000, 40.5100, 270.1000, 52.8000, 39.3800, 0.0400, 79.7100, 110.5600, 41.1600, 224.6600, 123.8800, 53.2700, 96.1500, 67.2800, 40.7300, 99.4900, 20.4900, 0.0400, 126.1000, 0.0000, 1.3700, 140.6500, 165.7200, 0.0000, 314.4200, 207.7400, 0.0400, 76.9300, 75.8000, 184.9100, 83.3700, 139.5300, 157.0500, 0.0000, 147.5900, 105.2800, 0.0000, 0.0000, 207.3000, 74.1100, 288.3900, 0.0400, 0.0000, 213.7200, 0.0400, 14.8300, 53.7000, 0.0400, 150.0800, 0.0000, 0, 123.73, 68.01, 9.52, 0, 111.86, 249.69, 354.18, 98, 31.3, 117.54, 455.32, 1.06, 127.92, 114.51, 173.85, 53.22, 0, 0, 0, 0.31, 376.69, 69.43, 278.8, 147.11, 0.04, 0, 120.05, 0, 0.04, 132.97, 0, 0, 9.98, 28.85, 13.77, 295.17, 191.54, 126.44, 84.83)

Thomas Lumley			Assoc. Professor, Biostatistics
tlumley at u.washington.edu	University of Washington, Seattle


