[R] problem with PCA

Denis Francisci denis.francisci at gmail.com
Fri Mar 10 11:44:43 CET 2017


Hi all.
I'm a newbie in PCA and I don't understand a behaviour of R.
I have this data matrix:

>mx_fus
  height diam  hole  weight
1    2.3  3.5  1.1   18
2    2.0  3.5  0.9   17
3    3.8  4.3  0.7   34
4    2.1  3.4  0.9   15
5    2.3  3.8  1.0   19
6    2.2  3.8  1.0   19
7    3.2  4.4  0.9   34
8    3.0  4.3  1.0   30
9    2.8  3.9  0.9   21
10   3.3  4.2  1.1   33
11   2.3  3.9  0.9   25
12   2.3  3.3  0.5   17
13   0.9  2.4  0.4   10
14   1.4  2.4  0.5   10
15   2.2  3.6  0.7   22
16   2.9  3.8  0.8   30
17   2.9  3.5  0.6   27
18   2.3  3.5  0.5   24
19   1.8  2.3  0.5   29
20   1.4  2.5  0.6   34
21   0.8  2.3  0.6   21
22   1.8  2.4  0.6   23
23   1.5  2.2  0.6    7
24   0.9  1.7  0.4   14
25   2.1  2.2  0.5   25
26   1.3  2.4  0.6   33
27   1.3  2.7  0.4   39
28   0.5  2.2  0.5   13
29   1.4  4.2  0.8   23
30   1.6  2.0  0.4   30
31   1.4  2.2  0.6   25
32   1.8  2.5  0.6   28
33   1.4  2.6  0.6   41
34   1.6  2.3  0.3   32
35   1.6  2.5  0.5   41
36   2.8  2.9  0.8   47
37   0.6  2.5  0.8   21
38   1.6  2.8  0.7   13
39   1.7  3.3  0.8   17
40   1.6  3.9  1.9   20
41   1.4  4.7  0.9   26
42   1.2  4.2  0.7   21
43   3.5  4.2  0.9   47
44   2.3  3.6  0.7   24
45   2.3  3.4  0.4   21
46   1.9  2.6  0.7   14
47   1.9  3.0  0.7   15
48   2.7  3.7  0.9   26
49   3.0  3.8  0.7   35
50   1.2  2.0  0.7    5
51   1.6  2.5  0.5   15
52   1.3  2.6  0.5   16
53   2.5  3.9  0.9   32
54   0.9  3.3  0.6    9
55   1.8  2.4  0.5   17
56   2.4  3.7  1.1   30
57   2.1  3.5  1.1   22
58   2.6  3.9  1.0   38
59   2.6  3.6  1.0   27
60   2.6  4.1  1.0   34
61   2.9  3.6  0.8   32
62   2.6  3.3  0.7   22
63   1.8  2.5  0.7   26
64   3.0  2.8  1.3    2
65   0.5  2.2  0.4    3
66   1.9  3.4  0.7   14
67   1.4  3.8  0.9   18
68   2.0  4.0  1.0   30
69   3.1  4.0  1.3   21
70   2.5  4.0  0.8   19
71   2.5  4.5  1.0   20
72   1.8  3.5  1.4   18
73   2.1  3.5  1.4   25
74   1.5  2.6  0.5    9
75   2.8  3.2  1.2   16
76   1.0  5.0  0.3   32
77   0.3  5.8  0.5   56
78   0.5  1.5  0.2    1
79   0.7  1.4  0.2    1
80   0.5  1.3  0.2    1
81   0.7  3.3  0.4    7
82   1.9  4.7  1.0   24
83   3.1  4.2  0.9   49
84   2.8  3.6  0.7   28
85   2.7  3.2  0.7   29
86   3.0  4.0  0.9   36
87   1.7  2.7  0.7   14
88   1.5  2.9  0.7   18
89   2.9  3.5  0.7   30
90   3.0  3.4  0.8   30
91   2.0  2.8  0.5   14
92   2.4  3.5  0.7   24
93   0.8  4.1  0.6   12
94   1.7  2.5  0.5   23
95   1.4  2.4  0.8   31
96   1.5  2.7  0.4   20
97   2.6  3.7  0.6   31
98   2.6  3.0  0.6   18
99   2.5  5.0  0.7   40
100  2.5  3.7  0.5   30
101  2.4  2.9  0.7   17
102  2.3  3.0  0.5   15
103  2.2  3.3  0.6   19
104  1.5  2.1  0.5    5
105  2.0  2.2  0.5   10
106  2.6  3.5  0.6   26
107  2.3  3.0  0.6   15
108  2.5  4.5  0.7   40
109  2.1  3.1  0.5   15
110  1.3  2.1  0.8   14
111  0.8  2.5  0.2    5
112  0.6  3.1  0.7    8

I perform a PCA in R:

> pca <- prcomp(mx_fus, scale. = TRUE)
> biplot(pca, choices = c(1, 2), cex = 0.7)

The biplot puts the arrows of diam and height very close together, along
the first component axis.
So I understand that these 2 variables are well represented by PC1 and that
they are correlated with each other.
But if I test the correlation, the value of the correlation coefficient is low:

>cor(mx_fus[,1],mx_fus[,2])
0.4828185
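
For context, the coefficient above is one entry of the full correlation
matrix, and with scaling turned on prcomp() decomposes exactly that matrix.
A minimal sketch of the relationship follows. It assumes only the first
eight rows of mx_fus, retyped so that the snippet runs on its own; the
object names d, r, p and e are made up for illustration, and the numbers
will differ from those for the full data set.

```r
# First eight rows of mx_fus, copied from the listing above (illustration only)
d <- data.frame(
  height = c(2.3, 2.0, 3.8, 2.1, 2.3, 2.2, 3.2, 3.0),
  diam   = c(3.5, 3.5, 4.3, 3.4, 3.8, 3.8, 4.4, 4.3),
  hole   = c(1.1, 0.9, 0.7, 0.9, 1.0, 1.0, 0.9, 1.0),
  weight = c(18, 17, 34, 15, 19, 19, 34, 30)
)

r <- cor(d)                    # all pairwise Pearson correlations
p <- prcomp(d, scale. = TRUE)  # PCA on the standardised variables
e <- eigen(r)                  # eigen-decomposition of the correlation matrix

# The PC variances are the eigenvalues of the correlation matrix
print(all.equal(p$sdev^2, e$values))

# The loadings are the eigenvectors, up to the sign of each column
print(all.equal(abs(unname(p$rotation)), abs(e$vectors)))
```

Replacing d with mx_fus should give the same comparison for the full data.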

Why does the plot say one thing and the correlation function the opposite?
Do two close arrows not represent a strong correlation between the 2
variables (as I read in some manuals), but only with the component axis?
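
One way this could be checked is to compare the angle between the two
arrows in the PC1-PC2 plane with the correlation itself. The sketch below
assumes that, with the default scale = 1, the biplot arrows are proportional
to the loadings multiplied by the PC standard deviations. It again uses only
the first eight rows of mx_fus so that it runs on its own; the names d, p, v,
v2, len, cos_angle and quality are hypothetical.

```r
# First eight rows of mx_fus, copied from the listing above (illustration only)
d <- data.frame(
  height = c(2.3, 2.0, 3.8, 2.1, 2.3, 2.2, 3.2, 3.0),
  diam   = c(3.5, 3.5, 4.3, 3.4, 3.8, 3.8, 4.4, 4.3),
  hole   = c(1.1, 0.9, 0.7, 0.9, 1.0, 1.0, 0.9, 1.0),
  weight = c(18, 17, 34, 15, 19, 19, 34, 30)
)
p <- prcomp(d, scale. = TRUE)

# Variable coordinates: loadings scaled by the PC standard deviations.
# For standardised data these are the variable-PC correlations.
v <- p$rotation %*% diag(p$sdev)

# With all four components the inner products reproduce cor(d)
full <- v %*% t(v)
print(all.equal(unname(full), unname(cor(d))))

# The biplot only shows PC1 and PC2, so keep the first two columns
v2  <- v[, 1:2]
len <- sqrt(rowSums(v2^2))  # arrow lengths in the PC1-PC2 plane

# Cosine of the angle between the height and diam arrows in that plane
cos_angle <- unname(sum(v2["height", ] * v2["diam", ]) /
                    (len["height"] * len["diam"]))

# Share of each variable's variance captured by PC1 and PC2 together
quality <- len^2

print(cos_angle)
print(cor(d$height, d$diam))
print(round(quality, 3))
```

If quality is well below 1 for height or diam, part of that variable lies
along PC3 and PC4, and the angle seen in the plot can look much smaller than
the correlation would suggest.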

Thanks in advance
