[R] rcorr.cens Goodman-Kruskal gamma

Frank E Harrell Jr f.harrell at vanderbilt.edu
Tue Mar 10 13:21:01 CET 2009


Kim Vanselow wrote:
> Thanks to David and Frank for the suggestions. With a 2-dimensional input, rcorr.cens and John Baron's implementation work well. But I am not able to calculate gamma for a multivariate matrix.
> 
> Example: columns = species; rows = relevés; the numbers are Braun-Blanquet (BB) values (ordinal scale; 1<3 but 3-1 is not necessarily 2)
> 
>      K. ceratoides   S. caucasica   A. tibeticum
> A1         3               1              1
> A2         0               3              2
> A3         1               1              0
> A4         2               2              0
> A5         0               3              2
> B1         1               1              1
> B2         4               3              1
> 
> I want to calculate a distance matrix based on Goodman-Kruskal's gamma (instead of the classical Euclidean, Bray-Curtis, Manhattan, etc.) which I can use for hierarchical cluster analysis (e.g. amap, vegan, cluster) in order to compare the different relevés.
>   
> Further suggestions would be greatly appreciated,
> Thank you very much,
> Kim
> 
> 
> 
>  
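One way the distance matrix asked about above might be built is sketched below, using only base R and the goodman() function quoted further down in this thread. It is a sketch under stated assumptions, not a recommended method: gamma_matrix() is a hypothetical helper, the conversion 1 - gamma is just one possible way to turn the similarity into a distance, and "average" linkage is an arbitrary choice. Note that gamma is undefined (0/0) for a relevé whose values are all equal, such as B1 in the example, so that row is dropped here before clustering.

```r
# Goodman-Kruskal gamma for two vectors (John Baron's function from this thread)
goodman <- function(x, y) {
  Rx <- outer(x, x, function(u, v) sign(u - v))
  Ry <- outer(y, y, function(u, v) sign(u - v))
  S1 <- Rx * Ry
  sum(S1) / sum(abs(S1))   # NaN when no pair is untied in both x and y
}

# Example data from the message: rows = releves, columns = species
bb <- matrix(c(3, 1, 1,
               0, 3, 2,
               1, 1, 0,
               2, 2, 0,
               0, 3, 2,
               1, 1, 1,
               4, 3, 1),
             ncol = 3, byrow = TRUE,
             dimnames = list(c("A1", "A2", "A3", "A4", "A5", "B1", "B2"),
                             c("K. ceratoides", "S. caucasica", "A. tibeticum")))

# Hypothetical helper: gamma between every pair of rows of m
gamma_matrix <- function(m) {
  n <- nrow(m)
  g <- matrix(NA_real_, n, n, dimnames = list(rownames(m), rownames(m)))
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      g[i, j] <- goodman(m[i, ], m[j, ])
    }
  }
  g
}

g <- gamma_matrix(bb)

# Assumption: releves with undefined gamma (constant rows) are left out
ok <- !is.nan(diag(g))
stopifnot(!anyNA(g[ok, ok]))   # other undefined pairs would need handling too

# Assumption: distance = 1 - gamma, ranging from 0 (gamma = 1) to 2 (gamma = -1)
d  <- as.dist(1 - g[ok, ok])
hc <- hclust(d, method = "average")
# plot(hc)   # dendrogram of the remaining releves
```

With only three species per relevé, gamma can take very few distinct values, so many distances will coincide; a real data set with more species should behave less coarsely. The resulting dist object can presumably be passed to any clustering function that accepts one.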
> -------- Original Message --------
>> Date: Mon, 09 Mar 2009 13:27:29 -0500
>> From: Frank E Harrell Jr <f.harrell at vanderbilt.edu>
>> To: David Winsemius <dwinsemius at comcast.net>
>> CC: Kim Vanselow <Vanselow at gmx.de>, r-help at r-project.org
>> Subject: Re: [R] rcorr.cens Goodman-Kruskal gamma
> 
>> David Winsemius wrote:
>>> I looked at the help page for rcorr.cens and was surprised that 
>>> function, designed for censored data and taking input as a Surv object, 
>>> was being considered for that purpose.  This posting to r-help may be of
>>> interest. John Baron offers a simple implementation that takes its input
>>> as (x,y):
>>>
>>> http://finzi.psych.upenn.edu/R/Rhelp02/archive/19749.html
>>>
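>>> # Goodman-Kruskal gamma for two vectors x and y of equal length
>>> goodman <- function(x, y) {
>>>   # sign of every pairwise difference: +1, -1, or 0 for ties
>>>   Rx <- outer(x, x, function(u, v) sign(u - v))
>>>   Ry <- outer(y, y, function(u, v) sign(u - v))
>>>   # +1 = concordant pair, -1 = discordant pair, 0 = tied in x or y
>>>   S1 <- Rx * Ry
>>>   # (concordant - discordant) / (concordant + discordant)
>>>   sum(S1) / sum(abs(S1))
>>> }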
>>>
>>> I then read Frank's response to John and it's clear that my impression 
>>> regarding potential uses of rcorr.cens was too limited. Appears that you
>>> could supply a "y" vector to the "S" argument and get more efficient 
>>> execution.
>> Yes, rcorr.cens was designed to handle censored data but works fine with 
>> uncensored Y.  You may need to specify Surv(Y) but first try just Y.  It 
>> would be worth testing the execution speed of the two approaches.
>>
>> Frank
>>
>> -- 
>> Frank E Harrell Jr   Professor and Chair           School of Medicine
>>                       Department of Biostatistics   Vanderbilt University
> 
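The speed comparison suggested above could be set up roughly as follows. This is only a sketch: the simulated x and y are made up for illustration, and the Hmisc call is guarded because the package may not be installed. The rcorr.cens help page describes outx=TRUE as giving a Goodman-Kruskal gamma type rank correlation, so its Dxy component is the value to compare against goodman().

```r
# John Baron's outer()-based gamma, as quoted in this thread
goodman <- function(x, y) {
  Rx <- outer(x, x, function(u, v) sign(u - v))
  Ry <- outer(y, y, function(u, v) sign(u - v))
  S1 <- Rx * Ry
  sum(S1) / sum(abs(S1))
}

# Made-up ordinal data with many ties (assumption, for illustration only)
set.seed(1)
x <- sample(0:5, 200, replace = TRUE)
y <- sample(0:5, 200, replace = TRUE)

# Time the outer()-based version; it builds two n x n matrices
t_outer <- system.time(g_outer <- goodman(x, y))

# Time rcorr.cens with an uncensored y passed directly as the S argument
if (requireNamespace("Hmisc", quietly = TRUE)) {
  t_hmisc <- system.time(r <- Hmisc::rcorr.cens(x, y, outx = TRUE))
  print(c(outer = g_outer, rcorr.cens = unname(r["Dxy"])))
  print(c(outer = t_outer[["elapsed"]], rcorr.cens = t_hmisc[["elapsed"]]))
} else {
  print(c(outer = g_outer))
}
```

For vectors this short both timings will likely be near zero; a meaningful comparison would need larger n or repeated calls.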
> Dear r-helpers!
> I want to classify my vegetation data with hierarchical cluster analysis.
> My dataset consists of abundance values (Braun-Blanquet ordinal scale; ranked) for each plant species and relevé.
> I found a lot of R packages dealing with cluster analysis, but none of them is able to calculate a distance measure for ranked data.
> Podani recommends the use of Goodman and Kruskal's gamma for the distance. I found the function rcorr.cens (outx=TRUE) of the Hmisc package, which should do it.
> What I don't understand is how to define the input vectors x, y with my vegetation dataset. The other thing is how I can use the output of rcorr.cens for a distance measure in the cluster analysis (e.g. in vegan or amap).
> Any help would be greatly appreciated,
> Thank you very much,
> Kim

A related function is Hmisc's varclus, which can use Spearman, Pearson, 
or Hoeffding indexes as similarity measures.
Frank

-- 
Frank E Harrell Jr   Professor and Chair           School of Medicine
                      Department of Biostatistics   Vanderbilt University
