[R] Principal Component Analysis - Selecting components? + right choice?

Stéphane Dray dray at biomserv.univ-lyon1.fr
Thu Dec 11 13:30:46 CET 2008


You can have a look at

S. Dray. On the number of principal components: A test of
dimensionality based on measurements of similarity between matrices.
Computational Statistics and Data Analysis, 52:2228-2237, 2008.

which is implemented in the testdim function of the ade4 package.
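
As a rough sketch of how this might be used (not a worked analysis of
your data): the example below simulates 36 correlated variables as a
stand-in for the monthly climate layers, runs a normed PCA with
dudi.pca, and passes it to testdim. The object names (clim, pca, td),
the simulated data, and the choices of 200 rows, nf = 5, nbax = 10 and
nrepet = 99 are assumptions made only for illustration.

```r
# Sketch only: simulated data stand in for the 36 monthly climate variables.
library(ade4)

set.seed(1)
n <- 200                                        # hypothetical number of cells
gradients <- matrix(rnorm(n * 2), n, 2)         # two assumed underlying gradients
weights   <- matrix(runif(2 * 36, -1, 1), 2, 36)
noise     <- matrix(rnorm(n * 36, sd = 0.5), n, 36)
clim      <- as.data.frame(gradients %*% weights + noise)

# dudi.pca centres and scales by default, i.e. a PCA on the correlation
# matrix, which is the kind of PCA testdim works with.
pca <- dudi.pca(clim, scannf = FALSE, nf = 5)

# Permutation-based test of dimensionality; only the first 10 axes are
# tested here to keep the run short.
td <- testdim(pca, nrepet = 99, nbax = 10)

td$nb       # estimated number of axes to keep
td$nb.cor   # same, using the sequential Bonferroni procedure
```

If the test suggests keeping k axes, the first k columns of pca$li (the
row coordinates) could then serve as covariates, provided nf was set to
at least k.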


Cheers.

Corrado wrote:
> Dear R gurus,
>
> I have some climatic data for a region of the world. They are monthly averages 
> for 1950-2000 of precipitation (12 months), minimum temperature (12 months), 
> and maximum temperature (12 months). I have scaled them to 2 km x 2 km cells, 
> and I have around 75,000 cells.
>
> I need to feed them into a statistical model as co-variates, to use them to 
> predict a response variable.
>
> The climatic data are obviously correlated: precipitation for January is 
> correlated to precipitation for February and so on .... even precipitation 
> and temperature are heavily correlated. I did some correlation analysis and 
> they are all strongly correlated.
>
> I thought of running PCA on them, in order to reduce the number of co-variates 
> I feed into the model.
>
> I ran the PCA using prcomp, quite successfully. Now I need a criterion 
> to select the right number of PCs (that is: is it 1, 2, 3, 4?).
>
> What criteria would you suggest?
>
> At the moment, I am using a criterion based on a threshold, but that is highly 
> subjective, even if there are some rules of thumb (Jolliffe, Principal 
> Component Analysis, 2nd edition, Springer Verlag, 2002). 
>
> Could you suggest something more rigorous?
>
> By the way, do you think I would have been better off by using something 
> different from PCA?
>
> Best,
>   

-- 
Stéphane DRAY (dray at biomserv.univ-lyon1.fr )
Laboratoire BBE-CNRS-UMR-5558, Univ. C. Bernard - Lyon I
43, Bd du 11 Novembre 1918, 69622 Villeurbanne Cedex, France
Tel: 33 4 72 43 27 57       Fax: 33 4 72 43 13 88
http://biomserv.univ-lyon1.fr/~dray/


