[R] Regarding Principal Component Analysis result Interpretation
bgunter.4567 at gmail.com
Sat Sep 16 01:40:32 CEST 2017
This list is about R programming, not statistics, although they do often
intersect. Nevertheless, this discussion seems to be all about the latter,
not the former, so I think you would do better bringing it to a statistics
list like stats.stackexchange.com rather than here.
"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
On Fri, Sep 15, 2017 at 5:12 AM, Ismail SEZEN <sezenismail at gmail.com> wrote:
> First, see the example at https://isezen.github.io/PCA/
> > On 15 Sep 2017, at 13:43, Shylashree U.R <shylashivashree at gmail.com> wrote:
> > Dear Sir/Madam,
> > I am trying to do PCA with the "iris" dataset and trying to interpret
> > the result. The dataset contains 150 observations of 5 variables:
> >     Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
> > 1            5.1         3.5          1.4         0.2    setosa
> > 2            4.9         3.0          1.4         0.2    setosa
> > .....
> > .....
> > 150          5.9         3.0          5.1         1.8 virginica
> > I then ran the 'prcomp' function on the dataset and got the following result:
> > > print(pc)
> > Standard deviations (1, .., p=4):
> > [1] 1.7083611 0.9560494 0.3830886 0.1439265
> >
> > Rotation (n x k) = (4 x 4):
> >                     PC1         PC2        PC3        PC4
> > Sepal.Length  0.5210659 -0.37741762  0.7195664  0.2612863
> > Sepal.Width  -0.2693474 -0.92329566 -0.2443818 -0.1235096
> > Petal.Length  0.5804131 -0.02449161 -0.1421264 -0.8014492
> > Petal.Width   0.5648565 -0.06694199 -0.6342727  0.5235971
> > I'm planning to use PCA as a feature selection step in my project, to
> > remove variables which are correlated. I have interpreted the PCA result,
> > but I am not sure whether my interpretation is correct.
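The call that produced the output quoted above was not posted. The standard deviations shown are consistent with running prcomp on the four numeric iris columns scaled to unit variance, so the sketch below assumes scale. = TRUE. The signs of the rotation columns can differ between platforms.

```r
# Assumption: the original call was not shown; scaling to unit variance
# reproduces the standard deviations quoted above.
data(iris)
pc <- prcomp(iris[, 1:4], center = TRUE, scale. = TRUE)
print(pc)

# Proportion of the total variance carried by each component.
var_explained <- pc$sdev^2 / sum(pc$sdev^2)
round(var_explained, 3)
```

summary(pc) reports the same proportions together with their cumulative sum.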
> You want to "remove variables which are correlated". Correlated among
> themselves? If so, why not compute a Pearson correlation matrix (see
> ?cor), define a threshold, and remove the variables whose correlation
> exceeds that threshold? Perhaps I did not understand you correctly;
> excuse me if so.
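The correlation-threshold idea suggested above could be sketched as follows. The cutoff of 0.9 and the greedy keep-first rule are illustrative assumptions, not something given in the thread; other rules for choosing which member of a correlated pair to drop are equally possible.

```r
data(iris)
X <- iris[, 1:4]                      # numeric columns only
cm <- cor(X, method = "pearson")      # 4 x 4 Pearson correlation matrix

threshold <- 0.9                      # hypothetical cutoff, chosen for illustration

# Greedy pass in column order: keep a variable only if its absolute
# correlation with every variable kept so far is below the threshold.
keep <- character(0)
for (v in colnames(cm)) {
  if (length(keep) == 0 || all(abs(cm[v, keep]) < threshold)) {
    keep <- c(keep, v)
  }
}
dropped <- setdiff(colnames(cm), keep)

keep
dropped
```

Because the pass runs in column order, which variable of a highly correlated pair survives depends on that order; reordering the columns can change the result.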
> For the iris dataset, each variable will be correlated with PC1 as much
> as possible, the remaining part will be correlated with PC2, and so on.
> Hence, you can identify which variables are similar in terms of VARIANCE.
> This should become clear if you examine the example that I gave above.
> In PCA, you can also calculate the correlations between the variables and
> the PCs, but this shows you how the PCs are affected by these variables. I
> don't know how you plan to carry out the feature selection, so I hope this
> helps you. Also note the resources section at the end of the example.
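The correlations between the variables and the PCs mentioned above can be computed directly from the component scores. This is a minimal sketch, again assuming the scaled prcomp fit (scale. = TRUE), which the thread does not confirm. For standardized variables, these correlations equal the rotation columns multiplied by the corresponding standard deviations.

```r
data(iris)
X <- iris[, 1:4]
# Assumption: unit-variance scaling, as the quoted output suggests.
pc <- prcomp(X, center = TRUE, scale. = TRUE)

# Correlation of each original variable with each component score.
var_pc_cor <- cor(X, pc$x)
round(var_pc_cor, 3)

# Same quantity obtained from the rotation: scale column k by sdev[k].
loadings <- sweep(pc$rotation, 2, pc$sdev, `*`)
all.equal(unname(var_pc_cor), unname(loadings))
```

A variable with a large absolute correlation on a component contributes strongly to it. Note that this describes the components rather than selecting a subset of the original variables.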