[R] prcomp: where do sdev values come from?

David L Carlson dcarlson at tamu.edu
Thu Jun 21 23:36:38 CEST 2012


This should help you figure it out:

getAnywhere("prcomp.default")

----------------------------------------------
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77843-4352


> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-
> project.org] On Behalf Of Adams, Sky
> Sent: Wednesday, June 20, 2012 3:15 PM
> To: r-help at r-project.org
> Subject: [R] prcomp: where do sdev values come from?
> 
> In the manual page for prcomp(), it says that sdev is "the standard
> deviations of the principal components (i.e., the square roots of the
> eigenvalues of the covariance/correlation matrix, though the
> calculation is actually done with the singular values of the data
> matrix)."  However, this is not what I'm finding.  The values appear
> to be the standard deviations of a reprojection of the data, as in the
> example below.
> 
> >
> test=matrix(data=c(1,1,2,2,0,0,2,1,2,1,0,2,1,2,1,2,2,0),nrow=3,ncol=6)
> > test=standardize(test)
> > test
>            [,1]      [,2]       [,3] [,4]       [,5]      [,6]
> [1,] -0.7071068  2.828427  0.8944272    0 -0.7071068  1.414214
> [2,] -0.7071068 -1.414214 -1.7888544   -2  1.4142136  1.414214
> [3,]  1.4142136 -1.414214  0.8944272    2 -0.7071068 -2.828427
> 
> > res1
> Standard deviations:
> [1] 3.622043e+00 2.877639e+00 1.812987e-16
> 
> Rotation:
>             PC1         PC2        PC3
> [1,]  0.3330598 -0.07347383  0.1418171
> [2,] -0.2319540  0.79957999  0.3837549
> [3,]  0.2745904  0.41276093 -0.8541541
> [4,]  0.5186794  0.23838204  0.2096053
> [5,] -0.2170828 -0.32631616 -0.1153013
> [6,] -0.6661196  0.14694766 -0.2140375
> 
> > res2=svd(test)
> >res2
> $d
> [1] 5.122342e+00 4.069596e+00 1.153778e-15
> 
> $u
>            [,1]       [,2]      [,3]
> [1,] -0.2800491  0.7669675 0.5773503
> [2,] -0.5241888 -0.6260134 0.5773503
> [3,]  0.8042379 -0.1409541 0.5773503
> 
> $v
>            [,1]        [,2]       [,3]
> [1,]  0.3330598 -0.07347383  0.7608840
> [2,] -0.2319540  0.79957999  0.3438537
> [3,]  0.2745904  0.41276093 -0.4028185
> [4,]  0.5186794  0.23838204  0.1543444
> [5,] -0.2170828 -0.32631616  0.3236952
> [6,] -0.6661196  0.14694766  0.1093469
> 
> Here, the values in $d are not the squares of the standard deviations.
> 
> > res3=test %*% res2$v
> > res3
>           [,1]       [,2]          [,3]
> [1,] -1.434507  3.1212479 -7.494005e-16
> [2,] -2.685074 -2.5476217  3.996803e-15
> [3,]  4.119582 -0.5736263 -2.053913e-15
> > apply(res3,2,sd)
> [1] 3.622043e+00 2.877639e+00 3.184320e-15
> 
> As shown above, taking the standard deviation of the columns of the
> reprojection of the eigenvectors on the original "data" gives the same
> values as the standard deviations given by prcomp().  Can anyone
> reconcile this example with the manual page for prcomp() or explain
> how the standard deviations are calculated?
> 
> the code for standardize():
> 
> standardize<-function(data){
>   numSamples = nrow(data);
> 
>   # calculate mean for each marker
>   mn=apply(data,2,mean);
>   numMarkers = ncol(data);
>   #standardize every entry
>   for(j in 1:numMarkers){
>     for(k in 1:numSamples){
>       temp = sqrt(mn[j]/2*(1-mn[j]/2));
>       data[k,j] = (data[k,j] - mn[j])/temp;
>     }
>   }
>   data;
> }
> 
> Thank you,
> Sky
> 
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.



More information about the R-help mailing list