[R] details of cor function

David Parkhurst parkhurs at imap.iu.edu
Fri Mar 7 16:38:51 CET 2014


Thank you for your response.  The first part of my question was meant to 
ask "how do I actually find the source code?"  I tried to find that, 
without success.

As for my comfort with a method that gives variable answers, I've 
experimented by running 100 jittered cases and taking the average.  When 
I've done that on six different datasets from my real data with cor and 
method="kendall", the mean tau from the 100 jittered cases has been 5% to 
10% lower than for the one call without jittering.  Given the many ties 
and zeroes in my data, I'm inclined to think the mean value with 
jittering is likely to be a better statistic.  But I don't know.
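
To make the averaging procedure concrete, here is a minimal sketch of it, written out by hand in Python rather than R so that nothing is hidden inside a library call.  The data, the jitter amount, the seed, and the function names tau_a and mean_jittered_tau are all made up for illustration; this is not what cor does internally.  The jitter is assumed to be small enough (less than half the smallest gap between distinct values) that it can only break ties, never reorder untied values.

```python
import random
from itertools import combinations


def tau_a(x, y):
    """Kendall's tau as (concordant - discordant) / total number of pairs."""
    s = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        prod = (x1 - x2) * (y1 - y2)
        s += (prod > 0) - (prod < 0)  # +1 concordant, -1 discordant, 0 tied
    n = len(x)
    return s / (n * (n - 1) / 2)


def mean_jittered_tau(x, y, amount, reps=100, seed=0):
    """Average tau over `reps` copies of the data with uniform jitter added."""
    rng = random.Random(seed)  # fixed seed so the answer is repeatable
    taus = []
    for _ in range(reps):
        xj = [v + rng.uniform(-amount, amount) for v in x]
        yj = [v + rng.uniform(-amount, amount) for v in y]
        taus.append(tau_a(xj, yj))
    return sum(taus) / reps


# Illustrative data only: x has three tied zeroes, y has no ties.
x = [0, 0, 0, 1, 2]
y = [1, 2, 3, 4, 5]
mean_tau = mean_jittered_tau(x, y, amount=0.25)
print(mean_tau)
```

Fixing the seed removes the "different answer every time" problem for a given run.  Under the small-jitter assumption above, each tied pair is equally likely to come out concordant or discordant, so the average should settle near the tau-a value of the unjittered data (0.7 for this toy example), which is lower than tau-b whenever there are ties.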

David

On 3/7/2014 10:25 AM, Greg Snow wrote:
> You could run the cor function on a small dataset where you know the
> values of tau-a and/or tau-b (either because you hand computed them,
> or found an example on the internet showing the difference); that
> would give some good evidence as to which is used.
>
> Or you could look at the source code; R is open source, after all.
>
> On the jittering question: are you comfortable with a method that
> would give a different answer every time you run it?
>
> On Thu, Mar 6, 2014 at 9:41 PM, David Parkhurst <parkhurs at imap.iu.edu> wrote:
>> How can I find out whether the cor function with method="kendall" computes
>> Kendall's tau-a or tau-b?  I understand that tau-b deals better with ties,
>> and I want to look for correlation between two variables that have lots of
>> ties (especially lots of zeroes for one of them).  The information provided
>> by ?cor does not specify which is computed.
>> And a question: would I be better off jittering the variables before
>> computing tau, given the many ties?
>>
>
>
>
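
The small-dataset check suggested above needs reference values for both statistics.  Here is a minimal sketch that computes tau-a and tau-b by brute force over all pairs, using the usual textbook definitions (tau-a divides by the total number of pairs; tau-b divides by the geometric mean of the numbers of pairs untied in x and untied in y).  It is written in Python so the tie handling is explicit; the function name kendall_tau and the toy data are made up for illustration, and it is not taken from the cor source.

```python
from itertools import combinations
from math import sqrt


def kendall_tau(x, y):
    """Return (tau_a, tau_b) for paired sequences x and y."""
    concordant = discordant = ties_x = ties_y = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0:
            ties_x += 1  # pair tied in x (may also be tied in y)
        if dy == 0:
            ties_y += 1  # pair tied in y
        if dx * dy > 0:
            concordant += 1
        elif dx * dy < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    s = concordant - discordant
    tau_a = s / n_pairs
    denom = sqrt((n_pairs - ties_x) * (n_pairs - ties_y))
    # tau-b is undefined when one variable is constant (every pair tied).
    tau_b = s / denom if denom > 0 else float("nan")
    return tau_a, tau_b


# Illustrative data only: three tied zeroes in x, no ties in y.
x = [0, 0, 0, 1, 2]
y = [1, 2, 3, 4, 5]
tau_a, tau_b = kendall_tau(x, y)
print(tau_a, tau_b)
```

For this toy data, 7 of the 10 pairs are concordant, none are discordant, and 3 are tied in x, so tau-a is 7/10 = 0.7 and tau-b is 7/sqrt(7*10), about 0.837.  Running cor(x, y, method="kendall") on the same two vectors and seeing which number comes back would answer the tau-a versus tau-b question.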



