[R] stratified Wilcoxon available?

torsten@hothorn.de torsten at hothorn.de
Tue Aug 30 14:59:52 CEST 2005


On Tue, 30 Aug 2005, Peter Dalgaard wrote:

> Torsten Hothorn <Torsten.Hothorn at rzmail.uni-erlangen.de> writes:
>
> > On Mon, 29 Aug 2005, Peter Dalgaard wrote:
> >
> > > Torsten Hothorn <Torsten.Hothorn at rzmail.uni-erlangen.de> writes:
> > >
> > > > > Dear All,
> > > > >
> > > > > is there a stratified version of the Wilcoxon test (also known as van
> > > > > Elteren test) available in R?
> > > >
> > > > you can plug it together using the `coin' infrastructure (see the
> > > > examples in the manual and vignette).
> > >
> > > I managed to dig out our old code, and patched it up loosely to fit
> > > versions of R later than 0.62 (the trend test code still seems
> > > broken). Use at own risk. The usage is fairly straightforward:
> > >
> > > > SKruskal.test(y~g1|g2)
> > >
> > >         Kruskal-Wallis stratified rank sum test
> > >
> > > data:  y , group:  g1 , strata:  g2
> > > K = 3.1486, df = 3, p-value = 0.3693
> > >
> >
> > the conditional version of the above test can be computed as follows:
> >
> > R> library("coin")
> > R> set.seed(290875)
> > R> mydf <- data.frame(y = rnorm(90), x = gl(3, 30)[sample(1:90)], b = gl(3, 30))
> > R>
> > R> ### with global ranks, i.e., ranking without using the blocks
> > R> kruskal_test(y ~ x | b, data = mydf)
> >
> >         Asymptotic Kruskal-Wallis Test
> >
> > data:  y by groups 1, 2, 3
> >          stratified by b
> > T = 0.5242, df = 2, p-value = 0.7694
> >
> > R>
> > R> ### with _blockwise_ ranking and quadratic form of the test statistic
> > R> independence_test(y ~ x | b, data = mydf, ytrafo = function(data) trafo(data,
> > +                   numeric_trafo = rank, block = mydf$b), teststat = "quad")
> >
> >         Asymptotic General Independence Test
> >
> > data:  y by groups 1, 2, 3
> >          stratified by b
> > T = 0.3824, df = 2, p-value = 0.826
> >
> > R>
> > R> ### the same
> > R> SKruskal.test(y ~ x | b, data = mydf)
> >
> >         Kruskal-Wallis stratified rank sum test
> >
> > data:  y , group:  x , strata:  b
> > K = 0.3824, df = 2, p-value = 0.826
> >
> > R>
> > R>
> > R>
> > R> ### trend test (using scores 1, 2, 3)
> > R> independence_test(y ~ x | b, data = mydf, ytrafo = function(data) trafo(data,
> > +                   numeric_trafo = rank, block = mydf$b), teststat = "quad",
> > +                   scores = list(x = 1:3))
> >
> >         Asymptotic General Independence Test
> >
> > data:  y by groups 1 < 2 < 3
> >          stratified by b
> > T = 0.0264, df = 1, p-value = 0.871
> >
> > R>
> > R> ### hm.
> > R> SKruskal.test(y ~ x | b, data = mydf, trend = TRUE)
> > Error in SKruskal.test(y ~ x | b, data = mydf, trend = TRUE) :
> >         subscript out of bounds
> >
> > Best,
> >
> > Torsten
>
> Nice to see that the old code made sense.

Nice to see that the new code is working correctly :-)

> A bit surprising that it
> gives _exactly_ the same result as the blockwise ranking in coin...

why? Without ties, the conditional and unconditional versions of the tests
should have exactly the same result.

> Perhaps introduce some ties? (round(y,1) is usually effective).
>

this should generate differences, yes.

> The trend test is easily fixed: just spell "t value" without capital V
> as we do nowadays. This gives
>
>
> > SKruskal.test(y ~ x | b, data = mydf, trend = TRUE)
>
>         Kruskal-Wallis stratified rank sum trend test
>
> data:  y , group:  x , strata:  b , trend:  as.numeric(group)
> Z = -0.1624, df = 1, p-value = 0.871
>
> > SKruskal.test(y ~ x | b, data = mydf, trend = 1:3)
>
>         Kruskal-Wallis stratified rank sum trend test
>
> data:  y , group:  x , strata:  b , trend:  1 2 3
> Z = -0.1624, df = 1, p-value = 0.871
>
> (The df=1 is a bit misleading in this case...)
>

maybe report 0.1624^2 = 0.0264 as test statistic (same as with teststat
= "quad" in `coin')?

Best,

Torsten

>
> --
>    O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
>   c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
>  (*) \(*) -- University of Copenhagen   Denmark          Ph:  (+45) 35327918
> ~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)                  FAX: (+45) 35327907
>




More information about the R-help mailing list