[R] Confused by SVD and Eigenvector Decomposition in PCA

Feng Zhang f0z6305 at labs.tamu.edu
Sat Feb 8 04:17:03 CET 2003


I used Matlab for this case study.
>X = randn(200,3); %% generate a 200x3 Gaussian matrix
>[a,b,c] = svd(X); %% singular value decomposition of X
>S = diag(b)
  S = [15.6765   14.8674   13.4016]'

>S(1)^2/sum(S.^2)
  0.3802

>ZeroedX = X - repmat(mean(X),200,1); %% ZeroedX is now zero-centered data
>C = cov(ZeroedX); %% covariance matrix of ZeroedX
>[U,L] = eig(C); %% eigendecomposition of C
>SE = diag(L)
  SE = [0.8918    1.1098    1.2337]'
>SE(end)/sum(SE) %% eig lists these eigenvalues in ascending order, so the largest is last
  0.3813

This is the case that confuses me: the two percentages, 0.3802 and 0.3813, are not the same.
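
One possible reading of the mismatch, offered as a sketch rather than a diagnosis: in the Matlab run above, svd is applied to the raw matrix while cov works on mean-centered data, so the two routes decompose slightly different matrices. The R code below assumes a fresh 200x3 Gaussian sample (the seed and the names Xc, pct_eig, pct_svd_centered and pct_svd_raw are illustrative, not from the thread) and checks that the squared singular values of the centered data, divided by n - 1, match the eigenvalues of the covariance matrix.

```r
set.seed(1)                                   # arbitrary seed, only for reproducibility
X  <- matrix(rnorm(200 * 3), 200, 3)          # 200 x 3 Gaussian data, as in the Matlab run
Xc <- scale(X, center = TRUE, scale = FALSE)  # subtract the column means

# Route 1: eigenvalues of the covariance matrix.
# cov() centers internally; eigen() returns the values in decreasing order.
ev <- eigen(cov(X), symmetric = TRUE)$values
pct_eig <- ev[1] / sum(ev)

# Route 2: squared singular values of the CENTERED data.
d_c <- svd(Xc)$d
pct_svd_centered <- d_c[1]^2 / sum(d_c^2)

# Route 3: squared singular values of the RAW data (no centering).
d_raw <- svd(X)$d
pct_svd_raw <- d_raw[1]^2 / sum(d_raw^2)

# Eigenvalues of cov(X) equal the squared singular values of Xc over (n - 1).
print(all.equal(ev, d_c^2 / (nrow(X) - 1)))
print(all.equal(pct_eig, pct_svd_centered))
print(c(eig = pct_eig, svd_centered = pct_svd_centered, svd_raw = pct_svd_raw))
```

With centered data the two percentages agree to numerical precision. The raw-data figure generally differs slightly, because the column means of a finite sample are not exactly zero.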

Fred
----- Original Message -----
From: "Liaw, Andy" <andy_liaw at merck.com>
To: "'Feng Zhang'" <f0z6305 at labs.tamu.edu>
Sent: Friday, February 07, 2003 6:25 PM
Subject: RE: [R] Confused by SVD and Eigenvector Decomposition in PCA


> I've already shown you one example.  If that's not enough, here's another
> one:
>
> > set.seed(1)
> > x <- matrix(runif(1e3), 50, 20)
> > La.eigen(crossprod(x))$value
>  [1] 258.5242317   9.3638224   8.7213839   7.7425270   6.5057190   6.2719056
>  [7]   5.6582657   4.5002047   4.2289555   3.9098726   3.7172642   3.2826449
> [13]   2.8758329   2.6907474   2.3300505   1.9700120   1.3191512   1.0228788
> [19]   0.8883083   0.5883287
> > La.svd(x)$d^2
>  [1] 258.5242317   9.3638224   8.7213839   7.7425270   6.5057190   6.2719056
>  [7]   5.6582657   4.5002047   4.2289555   3.9098726   3.7172642   3.2826449
> [13]   2.8758329   2.6907474   2.3300505   1.9700120   1.3191512   1.0228788
> [19]   0.8883083   0.5883287
>
> Where's your example of this not working?
>
> Andy
>
>
> > -----Original Message-----
> > From: Feng Zhang [mailto:f0z6305 at labs.tamu.edu]
> > Sent: Friday, February 07, 2003 12:07 PM
> > To: antonio rodriguez; R-Help
> > Subject: Re: [R] Confused by SVD and Eigenvector Decomposition in PCA
> >
> >
> > Thanks for those replies.
> >
> > But I tested several cases and found that the two
> > percentages, from SVD and from EVD, are not
> > the same.
> > So how can the difference be explained, and which
> > one is the right one to use
> > in PCA?
> >
> >
> > ----- Original Message -----
> > From: "antonio rodriguez" <arv at ono.com>
> > To: "Feng Zhang" <f0z6305 at labs.tamu.edu>; "R-Help"
> > <r-help at stat.math.ethz.ch>
> > Sent: Friday, February 07, 2003 2:36 AM
> > Subject: Re: [R] Confused by SVD and Eigenvector Decomposition in PCA
> >
> >
> > > Hi Feng,
> > >
> > > AFAIK, SVD analysis provides a one-step method for computing all the
> > > components of the eigenvalue problem, without the need to compute and
> > > store big covariance matrices. The resulting decomposition is also
> > > computationally more stable and robust.
> > >
> > > Cheers,
> > >
> > > Antonio Rodriguez
> > >
> > >
> > > ----- Original Message -----
> > > From: "Feng Zhang" <f0z6305 at labs.tamu.edu>
> > > To: "R-Help" <r-help at stat.math.ethz.ch>
> > > Sent: Thursday, February 06, 2003 7:03 PM
> > > Subject: [R] Confused by SVD and Eigenvector Decomposition in PCA
> > >
> > >
> > > > Hey, All
> > > >
> > > > In principal component analysis (PCA), we want to know what percentage
> > > > of the total variance in the data the first principal component
> > > > explains.
> > > >
> > > > Assume the data matrix X is zero-mean, and
> > > > I used the following procedure:
> > > > C = cov(X); %% calculate the covariance matrix
> > > > [EVector,EValues] = eig(C); %% eigendecomposition of C
> > > > L = diag(EValues); %% L is a column vector with the eigenvalues as its elements
> > > > percent = max(L)/sum(L); %% use the largest eigenvalue; eig need not list it first
> > > >
> > > >
> > > > Others argue for using singular value decomposition (SVD) to
> > > > calculate the same quantity, as:
> > > > [U,S,V]=svd(X);
> > > > L = diag(S);
> > > > L = L.^2;
> > > > percent = L(1)/sum(L);
> > > >
> > > >
> > > > So which is the correct way to calculate the percentage explained by
> > > > the first principal component?
> > > >
> > > > Thanks for your advice on this.
> > > >
> > > > Fred
> > > >
> > > > ______________________________________________
> > > > R-help at stat.math.ethz.ch mailing list
> > > > http://www.stat.math.ethz.ch/mailman/listinfo/r-help



