[R] strange behavior of cor() with pairwise.complete.obs

Daniel Malter daniel at umd.edu
Thu Jan 3 01:46:18 CET 2008


1. Yes, in the case in which x and y are both vectors rather than matrices.
Otherwise it returns matrices for both methods, of course.
2. As Peter wrote, yes. I guess this is something that should be fixed. That
would be more consistent, and the Pearson behavior (returning NA for the
all-NA column) makes sense.

If you need a quick-and-dirty fix for your problem, I would suggest trying
to loop over the column indices i and j of your two matrices and compute the
correlations individually:

cor(x[,i],y[,j],use="p",method="k")
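
Spelled out, the loop might look roughly like the sketch below. The helper
name pairwise_kendall is made up for illustration (it is not part of base R),
and the guard that skips pairs with fewer than two complete observations is my
own assumption about how to avoid the "'x' is empty" error, so treat it as a
starting point rather than a tested fix for every R version.

```r
# Hypothetical helper (not part of base R): Kendall correlation between every
# column of x and every column of y, using pairwise complete observations.
pairwise_kendall <- function(x, y) {
  x <- as.matrix(x)
  y <- as.matrix(y)
  stopifnot(nrow(x) == nrow(y))

  # Start with NA everywhere; cells are filled only when enough data exist.
  r <- matrix(NA_real_, nrow = ncol(x), ncol = ncol(y))

  for (i in seq_len(ncol(x))) {
    for (j in seq_len(ncol(y))) {
      # Rows where both columns are observed.
      ok <- complete.cases(x[, i], y[, j])
      # Assumption: leave NA when fewer than two complete pairs remain,
      # instead of calling cor() on empty input.
      if (sum(ok) >= 2) {
        r[i, j] <- cor(x[ok, i], y[ok, j], method = "kendall")
      }
    }
  }
  r
}

# The two matrices from Hilmar's example (renamed to avoid masking c()).
a <- matrix(c(1, 2, 3, 3, 4, 5), nrow = 3, ncol = 2)
b <- matrix(c(2, 3, 4, NA, NA, NA), nrow = 3, ncol = 2)
pairwise_kendall(a, b)
```

On Hilmar's matrices this should give 1 in the first column and NA in the
second, i.e. the same shape as the Pearson result.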

Daniel



-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On
Behalf Of Peter Dalgaard
Sent: Wednesday, January 02, 2008 6:22 PM
To: Hilmar Berger
Cc: r-help at r-project.org
Subject: Re: [R] strange behavior of cor() with pairwise.complete.obs

Hilmar Berger wrote:
> Sorry,
>
> I obviously did not state clearly what the problem is (thanks Daniel):
>
> 1. minor problem: cor() does return different types of variables for 
> methods "kendall" and pearson (matrix vs. scalar) when 
> pairwise.complete.obs is selected.
>
> 2. major problem: cor() does return with an error if both x and y are 
> matrices with method="kendall" when pairwise.complete.obs is selected 
> and one column of one of the two matrices is completely NA.
> This does not happen for method "pearson".
>
> Regards,
> Hilmar
>
> Hilmar Berger <hilmar.berger <at> imise.uni-leipzig.de> writes:
>   
>> Hi all,
>>
>> I'm not quite sure if this is a feature or a bug or if I just fail to 
>> understand the documentation:
>>
>> If I use cor() with pairwise.complete.obs and method=pearson, the 
>> result is a scalar:
>>
>> ->cor(c(1,2,3),c(3,4,6),use="pairwise.complete.obs",method="pearson")
>> [1] 0.9819805
>>
>> The documentation says that
>> " '"pairwise.complete.obs"' only works with the '"pearson"' method
>>      for 'cov' and 'var'."
>>
>> Thus, I guess that cor() should work for pairwise.complete.obs and 
>> method = "kendall", or am I misinterpreting that statement ?
>>
>>     
I would interpret it to mean that it does NOT work for "kendall" and
"spearman" and I don't see how you can possibly interpret otherwise.

>> -> cor(c(1,2,3),c(3,4,6),use="pairwise.complete.obs",method="kendall")
>>      [,1]
>> [1,]    1
>>
>> Now the result is a matrix with dimensions (1,1) - strange enough.
>>
>> Note that when I use "all.obs" or "complete.obs" I get a scalar for 
>> method kendall, too.
>>
>> It gets worse if one tries to calculate the correlation between the 
>> columns of two matrices (i.e. cor(x,y) with x and y being a matrix). 
>> Then
>>
>> -> c=matrix(c(1,2,3,3,4,5),nrow=3,ncol=2)
>> -> d=matrix(c(2,3,4,NA,NA,NA),nrow=3,ncol=2)
>> -> cor(c,d,use="pairwise.complete.obs",method="pearson")
>>      [,1] [,2]
>> [1,]    1   NA
>> [2,]    1   NA
>>
>> -> cor(c,d,use="pairwise.complete.obs",method="kendall")
>> Error: 'x' is empty (*translated from German error message*)
>>
>> The behavior is reproducible in R 2.4.1 and 2.6.1 (WinXP). I noticed 
>> that in 2.7.0 something was fixed in cor() related to "complete.obs" 
>> handling - would that fix my problems?
>>     
Apparently. There are ways to find this out yourself, you know....

The help page still claims that it doesn't work, though.

-- 
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark          Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)                  FAX: (+45) 35327907
