[R] cor vs cor.test

Godmar Back godmar at gmail.com
Tue Jul 7 19:14:36 CEST 2009


Thanks, Peter.

You're right, I mistyped and getOption('na.action') shows na.omit.
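
To illustrate the mix-up, here is a minimal sketch, assuming a session in
which the option has not been changed from its default. An option name
that does not exist simply yields NULL rather than an error, which is why
the typo went unnoticed:

```r
# Mistyped name: there is no option called "na.option", so this is NULL.
wrong <- getOption("na.option")
print(wrong)

# Correct name: "na.omit" unless the session has changed the default.
right <- getOption("na.action")
print(right)
```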

Perhaps my question was more a commentary on what I perceive as a lack of
rationale and orthogonality in R than it should have been. Presumably,
q[[i]] is a data frame and q[[i]][,1] is a numeric vector, which is why
cor and cor.test treat them differently. The two functions also handle
NAs through completely different interfaces. Why design things this way,
though?
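
The difference can be reproduced with a small sketch. The data frames x,
y and d below are made up for illustration and are not my survey data;
the behaviour shown is what I would expect from the help pages and the
output quoted below, not an authoritative description of the internals:

```r
# Hypothetical one-column data frames, each with one missing value.
x <- data.frame(a = c(1, 2, 3, 4, NA, 6))
y <- data.frame(b = c(2, 1, 4, 3, 5, NA))

class(x)       # "data.frame"
class(x[, 1])  # "numeric"

# cor() accepts data frames; NA handling is chosen through 'use'.
r_default  <- cor(x, y)                        # 1 x 1 matrix containing NA
r_complete <- cor(x, y, use = "complete.obs")  # incomplete pairs dropped

# cor.test() on data frames fails (the exact message depends on the R version).
failed <- inherits(try(cor.test(x, y), silent = TRUE), "try-error")

# The default method wants numeric vectors and uses complete pairs only.
ct <- cor.test(x[, 1], y[, 1])

# The formula method takes a data frame and an explicit na.action.
d <- data.frame(a = x$a, b = y$b)
ct_formula <- cor.test(~ a + b, data = d, na.action = na.omit)

print(c(as.numeric(r_complete), ct$estimate, ct_formula$estimate))
```

All three estimates agree because each call ends up using the same four
complete pairs; only the way of asking for that differs.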

 - Godmar

On Tue, Jul 7, 2009 at 11:42 AM, Peter Ehlers<ehlers at ucalgary.ca> wrote:
> ?cor says that cor() can be applied to
>  'numeric vector, matrix or data frame'
>
> ?cor.test requires
>  'numeric vectors of data values'
>
> So, what's your q?
>
> As to na.action:
> ?cor.test makes no reference to na.action for the default method.
> Looking at the code of cor.test.default shows that only complete
> cases are used. The formula method does have an argument na.action
> and it works just fine for me.
> Try getOption('na.action') and you'll probably find that it is set
>                  ^^^^^^
> to 'na.omit'.
>
>  -Peter Ehlers
>
> Godmar Back wrote:
>>
>> Hi,
>>
>> I am trying to use R for some survey analysis, and need to compute the
>> significance of some correlations. I read the man pages for cor and
>> cor.test, but I am confused about
>>
>> - whether these functions are intended to work the same way
>> - about how these functions handle NA values
>> - whether cor.test supports 'use = complete.obs'.
>>
>> Some example output may explain why I am confused:
>>
>> -----------------------------------------------
>> WORKS:
>>>
>>> cor(q[[9]], q[[10]])
>>
>>                  perceivedlearningcurve
>> overallimpression              0.7440637
>> -----------------------------------------------
>>
>> DOES NOT WORK:
>>>
>>> cor.test(q[[9]], q[[10]])
>>
>> Error in `[.data.frame`(x, OK) : undefined columns selected
>> -----------------------------------------------
>>
>> (I assume that's because of R's generous type coercions.... does R
>> have a "typeof" operator to learn what the type of q[[9]] is?)
>>
>> -----------------------------------------------
>> WORKS:
>>>
>>> cor.test(q[[9]][,1], q[[10]][,1])
>>
>>        Pearson's product-moment correlation
>>
>> data:  q[[9]][, 1] and q[[10]][, 1]
>> t = 12.9877, df = 136, p-value < 2.2e-16
>> alternative hypothesis: true correlation is not equal to 0
>> 95 percent confidence interval:
>>  0.6588821 0.8104055
>> sample estimates:
>>      cor
>> 0.7440637
>> -----------------------------------------------
>>
>> WORKS, but propagates NAs:
>>>
>>> cor(q[[9]], q[[51]])
>>
>>                  usefulnessautodetectionbox_ord
>> overallimpression                             NA
>> -----------------------------------------------
>> WORKS, and uses complete observations only
>>
>>> cor(q[[9]], q[[51]], use="complete.obs")
>>
>>                  usefulnessautodetectionbox_ord
>> overallimpression                      0.2859895
>> -----------------------------------------------
>> WORKS, apparently, but does not require 'use="complete.obs"' (!?)
>>
>>> cor.test(q[[9]][,1], q[[51]][,1])
>>
>>        Pearson's product-moment correlation
>>
>> data:  q[[9]][, 1] and q[[51]][, 1]
>> t = 3.1016, df = 108, p-value = 0.002456
>> alternative hypothesis: true correlation is not equal to 0
>> 95 percent confidence interval:
>>  0.1043351 0.4491779
>> sample estimates:
>>      cor
>> 0.2859895
>> -----------------------------------------------
>>
>> The help page for cor.test states that 'getOption('na.action')'
>> describes the action taken on NAs:
>>
>>> getOption("na.option")
>>
>> NULL
>> -----------------------------------------------
>>
>> No action is taken, yet cor.test appears to only use complete observations
>> (!?)
>>
>> Others believe that cor.test accepts 'use=complete.obs':
>> http://markmail.org/message/nuzqeouqhbb7f6ok
>>
>> --------------
>>
>> Needless to say, this makes writing robust code very hard.
>>
>> I'm wondering what the rationale for the inconsistencies between cor
>> and cor.test is.
>>
>> Thanks!
>>
>>  - Godmar
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>>
>



