[R] cor vs cor.test

Peter Ehlers ehlers at ucalgary.ca
Tue Jul 7 17:42:47 CEST 2009


?cor says that cor() can be applied to
  'numeric vector, matrix or data frame'

?cor.test requires
  'numeric vectors of data values'

So, what's your q?
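
The difference between the two help-page quotes can be sketched as follows. The data frame 'survey' and its column names are invented for illustration; they are not your data. The sketch assumes your q[[9]] is a one-column data frame, which the column indexing in your error message suggests. class() or str() will tell you what an object is.

```r
# Hypothetical survey data; the column names are invented for illustration.
survey <- data.frame(
  overall  = c(5, 4, 3, 5, 2, 4, 1, 3),
  learning = c(4, 4, 2, 5, 1, 3, 2, 3)
)

x_df <- survey["overall"]     # single brackets: a one-column data frame
x    <- survey[["overall"]]   # double brackets: a plain numeric vector
y    <- survey[["learning"]]

class(x_df)   # "data.frame"
class(x)      # "numeric"

# cor() accepts data frames and returns a matrix of correlations
r_mat <- cor(survey["overall"], survey["learning"])

# cor.test() wants numeric vectors, so extract the columns first
res <- cor.test(x, y)
res$estimate
```

Extracting the column before the call, as you did with q[[9]][,1], is the usual way to hand cor.test a vector.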

As to na.action:
?cor.test makes no reference to na.action for the default method.
Looking at the code of cor.test.default shows that only complete
cases are used. The formula method does have an argument na.action
and it works just fine for me.
Try getOption('na.action') (note 'na.action', not the 'na.option' you
typed) and you'll probably find that it is set to 'na.omit'.
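
A minimal sketch of the NA behaviour described above, using a made-up data frame 'd' (the names and values are assumptions for illustration only):

```r
# Hypothetical data with missing values; names are invented for illustration.
d <- data.frame(
  a = c(1, 2, NA, 4, 5, 6, 7, NA, 9, 10),
  b = c(2, 1, 4, NA, 5, 7, 6, 8, 10, 9)
)

getOption("na.action")   # typically "na.omit" unless it has been changed

# cor() propagates NAs unless told otherwise
r_default  <- cor(d$a, d$b)
r_complete <- cor(d$a, d$b, use = "complete.obs")

# Default method of cor.test(): incomplete pairs are dropped, no argument needed
res_default <- cor.test(d$a, d$b)

# Formula method: na.action is an explicit argument
res_formula <- cor.test(~ a + b, data = d, na.action = na.omit)

# With na.fail, missing values become an error instead of being dropped
failed <- inherits(
  try(cor.test(~ a + b, data = d, na.action = na.fail), silent = TRUE),
  "try-error"
)
```

Here the degrees of freedom reported by cor.test reflect only the complete pairs, which is how you can check how many observations were actually used.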

  -Peter Ehlers

Godmar Back wrote:
> Hi,
> 
> I am trying to use R for some survey analysis, and need to compute the
> significance of some correlations. I read the man pages for cor and
> cor.test, but I am confused about
> 
> - whether these functions are intended to work the same way
> - about how these functions handle NA values
> - whether cor.test supports 'use = complete.obs'.
> 
> Some example output may explain why I am confused:
> 
> -----------------------------------------------
> WORKS:
>> cor(q[[9]], q[[10]])
>                   perceivedlearningcurve
> overallimpression              0.7440637
> -----------------------------------------------
> 
> DOES NOT WORK:
>> cor.test(q[[9]], q[[10]])
> Error in `[.data.frame`(x, OK) : undefined columns selected
> -----------------------------------------------
> 
> (I assume that's because of R's generous type coercions.... does R
> have a "typeof" operator to learn what the type of q[[9]] is?)
> 
> -----------------------------------------------
> WORKS:
>> cor.test(q[[9]][,1], q[[10]][,1])
> 
>         Pearson's product-moment correlation
> 
> data:  q[[9]][, 1] and q[[10]][, 1]
> t = 12.9877, df = 136, p-value < 2.2e-16
> alternative hypothesis: true correlation is not equal to 0
> 95 percent confidence interval:
>  0.6588821 0.8104055
> sample estimates:
>       cor
> 0.7440637
> -----------------------------------------------
> 
> WORKS, but propagates NAs:
>> cor(q[[9]], q[[51]])
>                   usefulnessautodetectionbox_ord
> overallimpression                             NA
> -----------------------------------------------
> WORKS, and uses complete observations only
> 
>> cor(q[[9]], q[[51]], use="complete.obs")
>                   usefulnessautodetectionbox_ord
> overallimpression                      0.2859895
> -----------------------------------------------
> WORKS, apparently, but does not require 'use="complete.obs"' (!?)
> 
>> cor.test(q[[9]][,1], q[[51]][,1])
> 
>         Pearson's product-moment correlation
> 
> data:  q[[9]][, 1] and q[[51]][, 1]
> t = 3.1016, df = 108, p-value = 0.002456
> alternative hypothesis: true correlation is not equal to 0
> 95 percent confidence interval:
>  0.1043351 0.4491779
> sample estimates:
>       cor
> 0.2859895
> -----------------------------------------------
> 
> The help page for cor.test states that 'getOption('na.action')'
> describes the action taken on NAs:
> 
>> getOption("na.option")
> NULL
> -----------------------------------------------
> 
> No action is taken, yet cor.test appears to only use complete observations (!?)
> 
> Others believe that cor.test accepts 'use=complete.obs':
> http://markmail.org/message/nuzqeouqhbb7f6ok
> 
> --------------
> 
> Needless to say, this makes writing robust code very hard.
> 
> I'm wondering what the rationale for the inconsistencies between cor
> and cor.test is.
> 
> Thanks!
> 
>  - Godmar
> 



