[R] A basic statistics question

(Ted Harding) Ted.Harding at wlandres.net
Wed Aug 13 00:22:13 CEST 2014

On 12-Aug-2014 21:41:52 Rolf Turner wrote:
> On 13/08/14 07:57, Ron Michael wrote:
>> Hi,
>> I would need to get a clarification on a quite fundamental statistics
>> property, hope expeRts here would not mind if I post that here.
>> I learnt that the variance-covariance matrix of the standardized data is
>> equal to the correlation matrix for the unstandardized data. So I used the
>> following data.
> <SNIP>
>> (t(Data_Normalized) %*% Data_Normalized)/dim(Data_Normalized)[1]
>> Point is that I am not getting exact CORR matrix. Can somebody point
>> me what I am missing here?
> You are using a denominator of "n" in calculating your "covariance" 
> matrix for your normalized data.  But these data were normalized using 
> the sd() function which (correctly) uses a denominator of n-1 so as to 
> obtain an unbiased estimator of the population standard deviation.
> If you calculated
>     (t(Data_Normalized) %*% Data_Normalized)/(dim(Data_Normalized)[1]-1)
> then you would get the same result as you get from cor(Data) (to within 
> about 1e-15).
> cheers,
> Rolf Turner

One could argue about "(correctly)"!

From the "descriptive statistics" point of view, if one is given a single
number x, then this dataset has no variation, so one could say that
sd(x) = 0. And this is what one would get with a denominator of "n".

But if the single value x is viewed as sampled from a distribution
(with positive dispersion), then the value of x gives no information
about the SD of the distribution. If you use denominator (n-1) then
sd(x) = NA, i.e. is indeterminate (as it should be in this application).
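
A minimal sketch of the two conventions for a single observation. The
value 5 and the names sd_n and sd_n1 are made up for illustration; sd()
is the base R function, which uses the (n-1) denominator.

```r
x <- 5                      # a single, arbitrary observation
n <- length(x)

# "Descriptive" SD with denominator n: no variation in the data, so 0
sd_n <- sqrt(sum((x - mean(x))^2) / n)

# R's sd() uses denominator (n-1); with n = 1 the result is NA
sd_n1 <- sd(x)

print(sd_n)
print(sd_n1)
```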

The important thing when using pre-programmed functions is to know
which is being used. R uses (n-1), and this can be found from
looking at


or (with more detail) at


Ron had assumed that the denominator was n, apparently not being aware
that R uses (n-1).
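
To illustrate the point, here is a small sketch on made-up data (Ron's
original data were snipped above, so the matrix below is purely
hypothetical, as are the names cov_n and cov_n1). It assumes the data
were standardized with scale(), which centres each column and divides by
sd(), i.e. with the (n-1) denominator.

```r
set.seed(1)
Data <- matrix(rnorm(40), nrow = 10, ncol = 4)   # hypothetical data
n <- nrow(Data)

# Standardize each column: subtract the mean, divide by sd() (n-1 denominator)
Data_Normalized <- scale(Data)

# Cross-product divided by n: off by a factor of (n-1)/n
cov_n  <- crossprod(Data_Normalized) / n

# Cross-product divided by (n-1): matches cor(Data) up to rounding error
cov_n1 <- crossprod(Data_Normalized) / (n - 1)

print(all.equal(cov_n1, cor(Data), check.attributes = FALSE))
print(all.equal(cov_n, cor(Data) * (n - 1) / n, check.attributes = FALSE))
```

With denominator n the diagonal entries come out as (n-1)/n rather than 1,
which is the discrepancy Ron observed.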

Just a few thoughts ...

E-Mail: (Ted Harding) <Ted.Harding at wlandres.net>
Date: 12-Aug-2014  Time: 23:22:09
This message was sent by XFMail
