# [R] nonparametric multiple sample comparison

Liaw, Andy andy_liaw at merck.com
Mon Apr 26 02:06:29 CEST 2004

```Have you looked at the npmc package on CRAN?

HTH,
Andy

> From: aolinto_r at bignet.com.br
>
> Hello all,
>
> Here goes one of my first functions.
>
> I want to make a nonparametric multiple sample comparison
> with unequal sample
> sizes (see Zar's Biostatistical Analysis, 3rd. Ed., pg. 201
> Example 10.11, pg.
> 288 Example 11.10). In the real world, I want to compare
> samples of fish
> length captured with different fishing gears.
>
> After using the Kruskal-Wallis test I want to check the
> differences between
> each two samples.
>
> I wrote the function multcomp (see below) and it's working OK
> but I still have
> some doubts. To use it one must have a two-column dataframe
> with groups and
> measurements.
>
> 1) In line 20 (Results <- data.frame...) I create a dataframe
> to store the final
> results. It has a fixed row number but it would be better to
> have a variable
> number, i.e., the number of combination of the groups taken 2
> by 2 (4 groups =
> 5 combinations, 6 groups = 15 combinations). Which function
> in R returns the
> number of combinations? I couldn't find it!
>
> 2) In lines 35 to 39 I print the result tables. How can I
> make a "clean"
> print, without row numbers and quotation marks?
>
> 3) Also I'd like to calculate the critical values "q" and
> p-values but I
> couldn't find the distribution indicated in Zar's Table B.15
> (App107) "Critical Values of Q for Nonparametric Multiple
> Comparison Testing".
> Is it possible to calculate these numbers in R?
>
> Thanks in advance for any help or comments to improve this function.
>
> Best regards,
>
> Antonio Olinto
> Sao Paulo Fisheries Institute
> Brazil
>
> multcomp <- function(DataSet) {
> dat.multcomp <- DataSet
> names(dat.multcomp) <- c("VarCat","VarNum")
> attach(dat.multcomp)
> dat.multcomp\$Rank <- rank(VarNum)
> attach(dat.multcomp)
> RankList <- aggregate(Rank,list(Rank=Rank),FUN=length)
> t <- length(RankList\$Rank)
> st <- 0
> for (i in 1:t) if (RankList[i,2]>1)
> st<-st+(RankList[i,2]^3-RankList[i,2])
> LevCat <- levels(dat.multcomp\$VarCat)
> NLevCat <- aggregate(VarCat,list(LevCat=VarCat),FUN=length)
> RLevCat <- aggregate(Rank,list(LevCat=VarCat),FUN=sum)
> MLevCat <- aggregate(Rank,list(LevCat=VarCat),FUN=mean)
> SampleSummary <-
> data.frame(LevCat,RLevCat[,2],NLevCat[,2],MLevCat[,2])
> names(SampleSummary)<-c("Samples","RSum","N","RMean")
> SampleSummary <-
> SampleSummary[order(SampleSummary\$RMean,decreasing=T),]
> NCat <- length(LevCat)
> N <- length(dat.multcomp\$VarCat)
> Results <- data.frame(rep(NA,6),rep(NA,6),rep(NA,6),rep(NA,6))
> names(Results) <- c("Comparison","Difference","SE","Q")
> l <- 1
> for (i in 1:(NCat-1)) {
>    for (j in NCat:(i+1)) {
>    SE <- sqrt(((N*(N+1)/12)-(st/(12*(N-1))))*((1/SampleSummary[i,3])+
> (1/SampleSummary[j,3])))
>    Dif <- SampleSummary[i,4]-SampleSummary[j,4]
>    Q=Dif/SE
> Results[l,1] <- paste(SampleSummary[i,1],"vs",SampleSummary[j,1])
> Results[l,2] <- round(Dif,4)
> Results[l,3] <- round(SE,4)
> Results[l,4] <- round(Q,4)
> l <-l+1
>    }
> }
> print("Sample summary ranked by mean ranks")
> print(SampleSummary)
> print("")
> print("Table of multiple comparisons")
> print(Results)
> }
>
> ===========
>
> Data example taken from Zar (Biostatiscical Analysis):
>
> dat.limno <- data.frame(scan(what = list (Pound=" ", pH=0), sep=" "))
> 1 7.68
> 1 7.69
> 1 7.7
> 1 7.7
> 1 7.72
> 1 7.73
> 1 7.73
> 1 7.76
> 2 7.71
> 2 7.73
> 2 7.74
> 2 7.74
> 2 7.78
> 2 7.78
> 2 7.8
> 2 7.81
> 3 7.74
> 3 7.75
> 3 7.77
> 3 7.78
> 3 7.8
> 3 7.81
> 3 7.84
> 4 7.71
> 4 7.71
> 4 7.74
> 4 7.79
> 4 7.81
> 4 7.85
> 4 7.87
> 4 7.91
>
> kruskal.test(pH~Pound)
>
>         Kruskal-Wallis rank sum test
>
> data
> Kruskal-Wallis chi-squared = 11.9435, df = 3, p-value = 0.007579
>
> multcomp(dat.limno)
>
>  "Sample summary ranked by mean ranks"
>   Samples  RSum N    RMean
> 3       3 145.0 7 20.71429
> 4       4 163.5 8 20.43750
> 2       2 132.5 8 16.56250
> 1       1  55.0 8  6.87500
>  ""
>  "Table of multiple comparisons"
>   Comparison Difference     SE      Q
> 1     3 vs 1    13.8393 4.6923 2.9493
> 2     3 vs 2     4.1518 4.6923 0.8848
> 3     3 vs 4     0.2768 4.6923 0.0590
> 4     4 vs 1    13.5625 4.5332 2.9918
> 5     4 vs 2     3.8750 4.5332 0.8548
> 6     2 vs 1     9.6875 4.5332 2.1370
>
>
> -------------------------------------------------
> WebMail Bignet - O seu provedor do litoral
> www.bignet.com.br
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://www.stat.math.ethz.ch/mailman/listinfo/r-help