[R] Efficient way to convert covariance to Euclidian distance matrix

S Ellison S.Ellison at lgcgroup.com
Thu Oct 31 12:16:38 CET 2013



> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
> On Behalf Of Takatsugu Kobayashi
> 
> I am struggling to come up with an efficient vectorized way to convert
> 20Kx20K covariance matrix to a Euclidian distance matrix as a surrogate for
> dissimilarity matrix. Hopefully I can apply multidimensional scaling for
> mapping these 20K points (commercial products).
> 
> I understand that Distance(ij) = sigma(i) + sigma(j) - 2cov(ij). 

I suspect there's a typo or two in here.

sigma(i)^2 + sigma(j)^2 - 2cov(ij)

would be the variance of a difference x[i] - x[j]. That's not in the same units as the difference itself, so one might well want the standard deviation of the difference, that is, sqrt(sigma(i)^2 + sigma(j)^2 - 2cov(ij)).
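
For what it's worth, that formula vectorizes directly with outer(). The following is only a minimal sketch on a small made-up matrix; the names X, S, v, D2 and D are illustrative, not from the original post, and the data are simulated.

```r
# Hypothetical small example: 50 observations on 4 variables (simulated data)
set.seed(1)
X <- matrix(rnorm(200), nrow = 50, ncol = 4)
S <- cov(X)                      # covariance matrix, 4 x 4

v  <- diag(S)                    # variances sigma(i)^2
D2 <- outer(v, v, "+") - 2 * S   # sigma(i)^2 + sigma(j)^2 - 2cov(ij) = var(x[i] - x[j])
D2[D2 < 0] <- 0                  # guard against tiny negative values from rounding
D  <- sqrt(D2)                   # standard deviation of each pairwise difference
```

D is symmetric with a zero diagonal, so as.dist(D) could be passed to cmdscale(). Note that outer() allocates a second full-size matrix, which matters at 20K x 20K.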

I don't envy your attempt to work with 20k*20k matrices, though. That's about 3 Gbytes per object (20,000^2 doubles at 8 bytes each), and a lot of distances for MDS to optimise.
If it's just about visual display, perhaps prcomp on the original data would provide (visually) similar results without the overhead of a large covariance matrix?
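
A rough sketch of what that might look like, assuming (and this is only an assumption about the poster's data) that the original data matrix has observations in rows and products in columns. The names X, Xc, pc and coords are illustrative and the data are simulated.

```r
# Hypothetical data: 100 observations (rows) on 6 products (columns)
set.seed(1)
X <- matrix(rnorm(100 * 6), nrow = 100, ncol = 6)

# Centre each product on its own mean, as cov() does internally
Xc <- sweep(X, 2, colMeans(X))

# Treat each product as a point: transpose so that products are the rows
pc <- prcomp(t(Xc))

# First two principal component scores, one row per product
coords <- pc$x[, 1:2]
```

Euclidean distances between the rows of pc$x, divided by sqrt(nrow(X) - 1), match sqrt(sigma(i)^2 + sigma(j)^2 - 2cov(ij)), so plot(coords) should give a picture comparable to classical MDS on those distances without ever forming the 20K x 20K matrix.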


S Ellison

