[R] 3 little questions
Marc Schwartz
MSchwartz at medanalytics.com
Mon Feb 2 07:01:55 CET 2004
On Sun, 2004-02-01 at 20:22, Gabor Grothendieck wrote:
> See ?cor.test
I'll stand to be corrected here, but I do not believe that cor.test()
will work for Kendall's W, since it can only handle two variables.
Kendall's W is designed for >= 3 variables as a generalization of
rho/tau and Friedman's test.
I have seen suggestions that Kendall's W can be estimated by averaging
the pairwise Kendall's tau values across the variables, with two
variants of the formula depending on whether the number of variables is
even or odd, and only in the case of no ties.
I have also seen a suggestion that the R^2 from a one-way ANOVA yields
an approximation of Kendall's W. In fact, the SAS macro I reference
below uses this approach.
However, each of these approaches can give a value that differs from W,
so each comes with caveats under certain conditions.
I found nothing for Kendall's W when searching r-help or CRAN, and I
have not coded it myself. However, I did find one reference on the
s-news list at:
http://www.biostat.wustl.edu/archives/html/s-news/2001-03/msg00197.html
and I also located a SAS Macro called %MAGREE at:
http://ewe3.sas.com/techsup/download/stat/magree.html
If you are familiar with SAS, that might be helpful as well.
Another reference which has various formulas and worked examples for W
is:
Nonparametric Measures of Association
Jean Dickson Gibbons
Sage, 1993
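For anyone who wants a starting point, the textbook definition of W with
the usual tie correction can be sketched in a few lines of R. This is a
minimal sketch, not validated code: the function name kendall.w, the
layout of its argument (objects in rows, judges or variables in columns)
and the example ratings are all assumptions made up for illustration. It
uses W = 12 S / (m^2 (n^3 - n) - m T), where S is the sum of squared
deviations of the row rank sums from their mean and T is the sum of
(t^3 - t) over all tie groups of length t.

```r
# Sketch of Kendall's W (coefficient of concordance) with tie correction.
# ASSUMPTION: x is a numeric matrix, n objects in rows, m judges in columns.
kendall.w <- function(x)
{
  n <- nrow(x)
  m <- ncol(x)

  # rank the objects separately within each judge (midranks for ties)
  r <- apply(x, 2, rank)

  # squared deviations of the row rank sums from their mean, m * (n + 1) / 2
  S <- sum((rowSums(r) - m * (n + 1) / 2)^2)

  # tie correction: sum of (t^3 - t) over tie groups, added up over judges
  tie.corr <- sum(apply(r, 2, function(rk) {
    len <- table(rk)
    sum(len^3 - len)
  }))

  12 * S / (m^2 * (n^3 - n) - m * tie.corr)
}

# Made-up ratings: 6 objects ranked by 3 hypothetical judges
ratings <- cbind(j1 = c(1, 2, 3, 4, 5, 6),
                 j2 = c(2, 1, 3, 5, 4, 6),
                 j3 = c(1, 3, 2, 4, 6, 5))
w <- kendall.w(ratings)

# Sanity check: W should equal Friedman's chi-squared / (m * (n - 1)),
# with judges as blocks (rows) and objects as groups (columns)
w.friedman <- unname(friedman.test(t(ratings))$statistic) /
  (ncol(ratings) * (nrow(ratings) - 1))
all.equal(w, w.friedman)
```

Without ties the correction term is zero and this reduces to the plain
no-ties formula. The comparison with friedman.test() is only a
consistency check, not a substitute for working through the examples in
the reference above.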
With respect to gamma, I have worked out a way to calculate this and
some other measures in R, which I plan to add to the CrossTable()
function in the gregmisc package. Unfortunately, I have not yet
completed the coding for several of the measures and the associated
ASEs (asymptotic standard errors) and p-values due to lack of time.
I have done gamma, however, and the code is below, starting with the
functions to compute concordant and discordant pairs. These approaches
(using a cross-tabulation and matrix partitioning) can save a fair
amount of time if the number of "unique pairs" in the data is
substantially less than the total number of pairs.
# Calculate CONcordant Pairs in a table
# cycle through x[r, c] and multiply by
# sum(x elements below and to the right of x[r, c])
# x = table
concordant <- function(x)
{
  # get sum(matrix values > r AND > c)
  # for each matrix[r, c]
  mat.lr <- function(r, c)
  {
    lr <- x[(r.x > r) & (c.x > c)]
    sum(lr)
  }

  # get row and column index for each
  # matrix element
  r.x <- row(x)
  c.x <- col(x)

  # return the sum of each matrix[r, c] * sums
  # using mapply to sequence thru each matrix[r, c]
  sum(x * mapply(mat.lr, r = r.x, c = c.x))
}

# Calculate DIScordant Pairs in a table
# cycle through x[r, c] and multiply by
# sum(x elements below and to the left of x[r, c])
# x = table
discordant <- function(x)
{
  # get sum(matrix values > r AND < c)
  # for each matrix[r, c]
  mat.ll <- function(r, c)
  {
    ll <- x[(r.x > r) & (c.x < c)]
    sum(ll)
  }

  # get row and column index for each
  # matrix element
  r.x <- row(x)
  c.x <- col(x)

  # return the sum of each matrix[r, c] * sums
  # using mapply to sequence thru each matrix[r, c]
  sum(x * mapply(mat.ll, r = r.x, c = c.x))
}

# Calculate Goodman-Kruskal gamma
# x = table
calc.gamma <- function(x)
{
  c <- concordant(x)
  d <- discordant(x)

  gamma <- (c - d) / (c + d)

  gamma
}

Here is an example of use. Keep in mind that x is the cross tabulation
of two vectors of measures, in this example, yielding a 3 x 3 table:
> x <- matrix(c(70, 10, 27, 85, 134, 60, 15, 41, 100), ncol = 3)
> x
     [,1] [,2] [,3]
[1,]   70   85   15
[2,]   10  134   41
[3,]   27   60  100
> calc.gamma(x)
[1] 0.57045
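As a cross-check of the table-based counts, gamma can also be computed
by brute force over all pairs of observations. This is a sketch only:
gamma.pairs, a and b are names introduced here, and the two vectors are
reconstructed from the example table rather than taken from real data.
It compares every pair with outer(), so it needs N^2 comparisons (here
N = 542), which is the cost the cross-tabulation approach avoids.

```r
# Brute-force Goodman-Kruskal gamma from two raw vectors (sketch).
# A pair is concordant if both variables order it the same way,
# discordant if they order it in opposite ways; tied pairs are ignored.
gamma.pairs <- function(a, b)
{
  # sign of the product is +1 (concordant), -1 (discordant) or 0 (tied)
  s <- sign(outer(a, a, "-")) * sign(outer(b, b, "-"))

  # every unordered pair appears twice in the N x N matrix
  n.con <- sum(s > 0) / 2
  n.dis <- sum(s < 0) / 2

  (n.con - n.dis) / (n.con + n.dis)
}

# Rebuild one (row, column) observation per count in the example table
x <- matrix(c(70, 10, 27, 85, 134, 60, 15, 41, 100), ncol = 3)
a <- rep(as.vector(row(x)), times = as.vector(x))
b <- rep(as.vector(col(x)), times = as.vector(x))

# should reproduce the 0.57045 shown above
gamma.pairs(a, b)
```

Here table(a, b) gives back the original 3 x 3 counts, so both routes
work from the same data.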
If you have any questions on the above, let me know.
Hope this helps.
Marc Schwartz
On Sun, 2004-02-01 at 18:53, Siegfried.Macho wrote:
> Dear R-helpers,
>
> 3 questions:
> 1. Is there a package that contains a routine for computing Kendall's W
> (coefficient of concordance), with and without ties ?
> 2. Is there a package that contains a routine for computing Goodman's Gamma?
> 3. Is there a simple method for computing the number of ties as well as
> their lengths within a vector of ranks,
> e.g.
> >r1 <- rank(c(1, 3, 2, 3, 3, 2, 4))
>
> gives:
>
> [1] 1.0 5.0 2.5 5.0 5.0 2.5 7.0
>
> which contains 2 ties with length 2 and 3.
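
On question 3, one possible approach is to tabulate the ranks and keep
the counts greater than one. This is a sketch using only table() from
base R; tab and tie.lengths are names chosen here for illustration.

```r
# Ranks from the example in the question
r1 <- rank(c(1, 3, 2, 3, 3, 2, 4))

# count how often each rank value occurs
tab <- table(r1)

# a tie is any rank value that occurs more than once
tie.lengths <- tab[tab > 1]

length(tie.lengths)     # number of ties: 2
as.vector(tie.lengths)  # their lengths: 2 3
```

The names of tie.lengths hold the tied rank values themselves (2.5 and 5
in this example).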