>> I would need to get a clarification on a quite fundamental statistics
>> property, hope expeRts here would not mind if I post that here.
>> I learnt that the variance-covariance matrix of the standardized data is
>> equal to the correlation matrix for the unstandardized data. So I used the
>> following
>> data.
>> (t(Data_Normalized) %*% Data_Normalized)/nrow(Data_Normalized)
>> The point is that I am not getting the exact CORR matrix. Can somebody
>> point out what I am missing here?
> You are using a denominator of "n" in calculating your "covariance"
> matrix for your normalized data.  But these data were normalized using
> the sd() function which (correctly) uses a denominator of n-1 so as to
> obtain an unbiased estimator of the population standard deviation.
> If you calculated
>
>     (t(Data_Normalized) %*% Data_Normalized)/(nrow(Data_Normalized)-1)
>
> then you would get the same result as you get from cor(Data) (to within
> rounding error).
From the "descriptive statistics" point of view, if one is given a single
number x, then this dataset has no variation, so one could say that
sd(x) = 0. And this is what one would get with a denominator of "n".

But if the single value x is viewed as sampled from a distribution
(with positive dispersion), then the value of x gives no information
about the SD of the distribution. If you use denominator (n-1) then
sd(x) = NA, i.e. is indeterminate (as it should be in this application).
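
A quick check of the two conventions on a single value, as a minimal sketch.
Note that sd_n() is a hypothetical helper written just for this illustration
(it is not a base R function); it computes the SD with a denominator of n.

```r
x <- 5   # a "dataset" consisting of one number (arbitrary value)

# R's sd() uses denominator (n-1); with n = 1 the result is NA.
sd(x)

# Hypothetical helper, not part of base R: SD with denominator n.
sd_n <- function(v) sqrt(mean((v - mean(v))^2))

# With denominator n, a single value has zero spread.
sd_n(x)
```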

The important thing when using pre-programmed functions is to know
which is being used. R uses (n-1), and this can be found from
looking at

?sd

or (with more detail) at

?cor

Ron had assumed that the denominator was n, apparently not being aware
that R uses (n-1).
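
For what it is worth, here is a small sketch of the comparison on made-up
data. It assumes the normalization was done with scale(), which centres each
column and divides by sd(); the matrix Data below is simulated and is not
Ron's data.

```r
set.seed(1)
Data <- matrix(rnorm(50), nrow = 10, ncol = 5)   # simulated data, 10 rows

# Centre each column and divide by its sd() (denominator n-1).
Data_Normalized <- scale(Data)
n <- nrow(Data_Normalized)

# Cross-product divided by n: NOT the correlation matrix.
with_n   <- (t(Data_Normalized) %*% Data_Normalized) / n

# Cross-product divided by (n-1): agrees with cor(Data).
with_n_1 <- (t(Data_Normalized) %*% Data_Normalized) / (n - 1)

all.equal(with_n_1, cor(Data), check.attributes = FALSE)
isTRUE(all.equal(with_n, cor(Data), check.attributes = FALSE))
```

The version with denominator n is just the correlation matrix multiplied by
(n-1)/n, so its diagonal comes out as 0.9 here rather than 1.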

Just a few thoughts ...
Ted.

