[R] pearson's correlation

Eik Vettorazzi E.Vettorazzi at uke.uni-hamburg.de
Sat Apr 5 23:39:10 CEST 2008


The difference may be due to different handling of missing values.

If you compute cor(x,y) "by hand" in Excel, you use all available values 
of x and of y to calculate sd(x) and sd(y) separately. But cov(x,y) in 
Excel will use only complete pairs of (x,y), which is likely not the 
same set. So your sd and cov (and the means within cov) will be calculated 
on different data. In R, if you use the option use="complete.obs" in cor(), 
all intermediate calculations are done on the same (complete) set.
If that is the cause of your problem, you should have got an error message 
when you tried cor() in R without this option on your dataset. But without 
a reproducible example of what you did, this is just guessing.
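
To illustrate the point, here is a minimal sketch in R. The vectors x and y are made-up data (not from the original question), each with one missing value in a different position. The "mixed" calculation is only my guess at what the spreadsheet effectively did: covariance on complete pairs, standard deviations on all available values.

```r
# Hypothetical data: x and y each have one NA, in different positions
x <- c(1.2, 2.4, NA, 4.1, 5.3, 6.8)
y <- c(2.0, NA, 3.9, 4.4, 6.1, 6.5)

# TRUE for rows where both x and y are observed
ok <- complete.cases(x, y)

# Mixed data sets: cov on complete pairs, sd on all available values
r_mixed <- cov(x[ok], y[ok]) / (sd(x, na.rm = TRUE) * sd(y, na.rm = TRUE))

# Consistent: every quantity computed on the same complete pairs
r_consistent <- cov(x[ok], y[ok]) / (sd(x[ok]) * sd(y[ok]))

# cor() restricted to complete pairs
r_cor <- cor(x, y, use = "complete.obs")

print(c(mixed = r_mixed, consistent = r_consistent, cor = r_cor))
```

Here r_consistent and r_cor should agree, while r_mixed differs because its numerator and denominator come from different subsets. Note that in recent versions of R the default is use="everything", so a plain cor(x, y) on such data returns NA instead of stopping with an error.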

hth.
Ake Nauta wrote:
> Hello,
>  
> I used the function cor to calculate the Pearson correlation coefficient between variables. However, the resulting values do not correspond to the outcome of my Excel calculations, for which I used the formula Cor(x,y)=Cov(x,y)/(SD(x)*SD(y))
> So my question is: How does the function "cor" compute the Pearson correlation coefficient?
>  
> Thank you in advance,
>  
> Ake Nauta



