[R] PCA

Sat Apr 26 16:34:49 CEST 2003

I think you can get what you want with "svd" or "La.svd".  Consider 
the following:

	  The singular value decomposition ("svd" or "La.svd" in R 1.6.2) is 
something like the following:  Any n x m matrix A can be written in the 
following format:

	 A = P Lam Q,

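where P and Q are orthogonal, and Lam is diagonal.  If n < m, then we 
can consider P to be n x n, so P'P = PP' = I, Lam = n x n diagonal, and 
Q = n x m with QQ' = I.  (In R, "svd" returns components u, d and v with 
A = u diag(d) v', so P = u, Lam = diag(d) and Q = t(v).)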
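
As a quick check of the decomposition described above, here is a minimal sketch on a small random matrix. The sizes and the names P, Lam and Q are only illustrative; it assumes that svd returns u, d and v with A = u diag(d) v'.

```r
# Small stand-in for a wide data matrix (n < m); the sizes are arbitrary.
set.seed(1)
n <- 5; m <- 20
A <- matrix(rnorm(n * m), nrow = n, ncol = m)

s <- svd(A)       # s$u is n x n, s$d has length n, s$v is m x n
P   <- s$u
Lam <- diag(s$d)
Q   <- t(s$v)     # n x m, so that A = P Lam Q

err.recon <- max(abs(A - P %*% Lam %*% Q))     # A = P Lam Q
err.P     <- max(abs(crossprod(P) - diag(n)))  # P'P = I
err.Q     <- max(abs(Q %*% t(Q) - diag(n)))    # QQ' = I
```

All three error terms should be zero up to rounding error.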

	  Now suppose A = your data matrix minus the column means.  Then the 
sample covariance matrix of the m variables, Var.A, can be written as 
follows:

	Var.A = A'A/(n-1) = Q' Lam^2 Q / (n-1).

The rows of Q give the loadings (the principal component directions), 
P Lam gives the corresponding scores, and the diagonal of Lam^2/(n-1) 
gives their associated variances.  Note that the m x m matrix Var.A 
never has to be formed; only the n x m matrix A is decomposed.
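
Here is a rough sketch of the whole calculation on a small made-up data set, with prcomp used only as a sanity check. The matrix X and the names loadings, scores and variances are my own, not anything from your data.

```r
# Made-up data: n observations (rows) on m variables (columns), n < m.
set.seed(2)
n <- 10; m <- 200
X <- matrix(rnorm(n * m), nrow = n, ncol = m)

A <- scale(X, center = TRUE, scale = FALSE)  # subtract the column means
s <- svd(A)

loadings  <- s$v                  # m x n; its columns are the rows of Q
scores    <- s$u %*% diag(s$d)    # n x n; the same as A %*% loadings
variances <- s$d^2 / (n - 1)      # variance of each component

# Compare with prcomp on the same data (signs of components may differ).
pc <- prcomp(X)
diff.var    <- max(abs(variances - pc$sdev^2))
diff.scores <- max(abs(abs(scores) - abs(pc$x)))
diff.proj   <- max(abs(scores - A %*% loadings))
```

Because A has been centered, its rank is at most n-1, so the last variance comes out as zero up to rounding error.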

If you have trouble with the details, please let us know.

hope this helps. spencer graves

Andrew C. Ward wrote:
> What about trying a sub-set of the data?
> 
> Regards,
> 
> Andrew C. Ward
> 
> CAPE Centre
> Department of Chemical Engineering
> The University of Queensland
> Brisbane Qld 4072 Australia
> andreww at cheque.uq.edu.au
> 
> 
> 
> On Saturday, April 26, 2003 10:44 AM, array chip 
> [SMTP:arrayprofile at yahoo.com] wrote:
> 
>>Hi, I have a dataset of dimensions 50 x 15000, and tried to use
>>princomp or prcomp on this dataset with 15000 columns as
>>variables, but it seems that the 2 functions can't handle this
>>large number of columns. Does anyone have any suggestions to get
>>around this? Thanks
>>
>>______________________________________________
>>R-help at stat.math.ethz.ch mailing list
>>https://www.stat.math.ethz.ch/mailman/listinfo/r-help