[Rd] Incorrect handling of NA's in cor() (PR#6750)

ligges at statistik.uni-dortmund.de
Fri Apr 9 19:35:33 CEST 2004


Marek Ancukiewicz wrote:
> Dear Uwe,
> 
> You are wrong. 

Whoops. My apologies!!!


In R-1.9.0 beta I get:

cor(x[!is.na(x)&!is.na(y)],y[!is.na(x)&!is.na(y)],method="s")
# [1] -0.4
cor(x,y,use="complete.obs", method="s")
# [1] -0.5291503

I'll take a look!

Uwe
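
For reference, the value of -0.4 can be reproduced by hand. The sketch below is not part of the original exchange. It drops incomplete pairs first, ranks what remains, and compares the result with the textbook formula for untied ranks and with cor.test(). The names ok, rx, ry and the rho_* variables are illustrative only.

```r
# Data from the original report.
x <- c(1, 2, 3, NA, 5, 6)
y <- c(4, NA, 2, 5, 1, 3)

# Casewise deletion: keep only pairs where both values are present.
ok <- complete.cases(x, y)
rx <- rank(x[ok])
ry <- rank(y[ok])

# Spearman's rho is Pearson's correlation of the ranks.
rho_ranks <- cor(rx, ry)

# Closed form for untied ranks: 1 - 6 * sum(d^2) / (n * (n^2 - 1)).
n <- sum(ok)
rho_formula <- 1 - 6 * sum((rx - ry)^2) / (n * (n^2 - 1))

# cor.test() removes incomplete pairs itself before ranking.
rho_test <- unname(cor.test(x, y, method = "spearman")$estimate)

print(c(rho_ranks, rho_formula, rho_test))  # each is -0.4
```

With four complete pairs the rank differences are -3, 0, 2 and 1, so sum(d^2) is 14 and the formula gives 1 - 84/60 = -0.4.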



> First, I've read the help file before
> submitting the report. For two variables,
> use="pairwise.complete.obs" and use="complete.obs" should be
> equivalent, shouldn't it?  Of course, the results will be
> different when we have more than 2 variables. Second, with the
> call you proposed I am also getting an incorrect result:
> 
> 
>>cor(x, y, use="pairwise.complete.obs", method="s")
> 
> [1] -0.1428571
> 
> The correct result is -0.4, as correctly calculated by
> cor.test()
> 
> Regards
> 
> Marek Ancukiewicz
> 
> 
> 
> 
>>X-Original-To: msa at biostat.mgh.harvard.edu
>>Date: Fri, 09 Apr 2004 19:06:47 +0200
>>From: Uwe Ligges <ligges at statistik.uni-dortmund.de>
>>Organization: Fachbereich Statistik, Universitaet Dortmund
>>X-Accept-Language: en-us, en, de-de, de
>>Cc: R-bugs at biostat.ku.dk
>>
>>msa at biostat.mgh.harvard.edu wrote:
>>
>>>Full_Name: Marek Ancukiewicz
>>>Version: 1.8.1
>>>OS: Linux
>>>Submission from: (NULL) (132.183.12.87)
>>>
>>>
>>>Function cor() incorrectly handles missing observations with method="spearman":
>>>
>>>
>>>
>>>>x <- c(1,2,3,NA,5,6)
>>>>y <- c(4,NA,2,5,1,3)
>>>>cor(x,y,use="complete.obs",method="s")
>>>
>>>[1] -0.1428571
>>>
>>>
>>>>cor(x[!is.na(x)&!is.na(y)],y[!is.na(x)&!is.na(y)],method="s")
>>>
>>>[1] -0.4
>>>
>>>These two results should be the same.
>>>
>>
>>
>>No! Please read at least the help file, ?cor, before submitting a bug 
>>report:
>>
>>
>>"If use is "complete.obs" then missing values are handled by casewise 
>>deletion. Finally, if use has the value "pairwise.complete.obs" then the 
>>correlation between each pair of variables is computed using all 
>>complete pairs of observations on those variables."
>>
>>
>>Hence
>>   cor(x, y, use="pairwise.complete.obs", method="s")
>>is what you expect ...
>>
>>Uwe Ligges
>>
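
One way to see where the other two numbers in this thread could come from is sketched below. It is offered as a conjecture, not a confirmed diagnosis: nothing in the messages says how cor() treated NA internally. The assumption is that missing values were ranked as though they were the largest values instead of being removed before ranking. rank(..., na.last = TRUE) is used here only to mimic that ordering, and the variable names are illustrative.

```r
# Data from the original report.
x <- c(1, 2, 3, NA, 5, 6)
y <- c(4, NA, 2, 5, 1, 3)

# Rank all six values, placing NA last (so each NA receives rank 6).
rx_all <- rank(x, na.last = TRUE)
ry_all <- rank(y, na.last = TRUE)

# Correlating all six ranks, NAs included:
r_all <- cor(rx_all, ry_all)             # -0.1428571

# Ranking first and deleting incomplete pairs afterwards:
ok <- complete.cases(x, y)
r_late <- cor(rx_all[ok], ry_all[ok])    # -0.5291503

# Deleting first and ranking afterwards gives the expected value:
r_ok <- cor(rank(x[ok]), rank(y[ok]))    # -0.4

print(c(r_all, r_late, r_ok))
```

Under this assumption the first figure matches the result reported for 1.8.1 and the second matches the result reported for 1.9.0 beta. Only the order "delete, then rank" reproduces -0.4.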


