[R] Correlation question

Stephane Vaucher vauchers at iro.umontreal.ca
Thu Sep 9 21:30:50 CEST 2010


Hi everyone,

Thanks for the help.

On Thu, 9 Sep 2010, Peter Ehlers wrote:

> The first thing to do when you get results that you don't expect is
> to check the help page. The page for cor clearly states that its
> input is to be a *numeric* vector, matrix or data frame (my emphasis).
> I would not be happy if R simply ignored non-numeric data. After all,
> it's trivial to ensure that you feed only numeric data to cor().

Indeed, the documentation states that cor() takes numeric input. It
does not state how it reacts to an inappropriate input type. That's
why I expected it either to produce an error message or to return
accurate results. I did not expect an incorrect result. I should not
have assumed that my expectations would be correct.

> Having said that, I guess others have found cor() problematic when
> non-valid input is supplied and so R now (as of 2.11.0) issues an
> error message that "'x' must be numeric". You should always check the
> latest released version to see if changes have been made. The NEWS
> file for 2.11.0 contains this:
>  cor() and cov() now test for misuse with non-numeric
>  arguments, such as the non-bug report PR#14207.
> You're doing the right thing by asking here first before reporting.
> It would definitely not be a good idea to report a (non-)bug
> in an outdated version of R.

Since my manipulations were simple, I assumed that others would have
observed the same behaviour. In any case, I'm happy that the function
now checks its preconditions. Otherwise, it would have been good to
state in the documentation that cor() can compute garbage when given
non-numeric data.
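
For anyone still on an older version, here is a minimal sketch of one
way to make sure only numeric columns reach cor(). The data frame and
its column names are made up for illustration; they are not my actual
data.

```r
# Hypothetical data frame with one non-numeric (factor) column
df <- data.frame(
  height = c(150, 160, 170, 180),
  weight = c(52, 61, 70, 83),
  group  = factor(c("a", "b", "a", "b"))
)

# Flag the numeric columns; is.numeric() is FALSE for factors
num_cols <- vapply(df, is.numeric, logical(1))

# Correlate only the numeric columns
r <- cor(df[, num_cols, drop = FALSE])
print(r)
```

Selecting columns explicitly this way should behave the same whether
or not the installed cor() rejects non-numeric arguments itself.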

cheers,
Stephane

>  -Peter Ehlers
>
> [rest snipped; not relevant to my comments.]
>


