[R] FW: Principal Components in a Linear Model

David Carlson dcarlson at tamu.edu
Fri Nov 22 20:39:14 CET 2013


Bert is correct. 

In addition, you are using prcomp() for your principal
components analysis so the initial principal component loadings
are called "rotation" in contrast to princomp() where they are
called "loadings." So you do not have "rotated" components in
the traditional sense of the word. If you compare the results,
you will see that the two agree (with reflection of the first
four components). To rotate the results you would need to use
varimax() or another function (e.g. principal() in package
psych) that provides more rotation methods. 
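
The comparison described above can be sketched in base R as
follows. The data are typed in from the original post so the
block runs without network access; the choice of two components
for varimax() is only an illustrative assumption, not something
the original poster asked for.

```r
# Census data copied from the original post (columns V1..V5).
census <- read.table(text = "
5.935 14.2 2.265 2.27 2.91
1.523 13.1 0.597 0.75 2.62
2.599 12.7 1.237 1.11 1.72
4.009 15.2 1.649 0.81 3.02
4.687 14.7 2.312 2.50 2.22
8.044 15.6 3.641 4.51 2.36
2.766 13.3 1.244 1.03 1.97
6.538 17.0 2.618 2.39 1.85
6.451 12.9 3.147 5.52 2.01
3.314 12.2 1.606 2.18 1.82
3.777 13.0 2.119 2.83 1.80
1.530 13.8 0.798 0.84 4.25
2.768 13.6 1.336 1.75 2.64
6.585 14.9 2.763 1.91 3.17
", header = FALSE)

pca1 <- prcomp(census)    # loadings are stored in $rotation
pca2 <- princomp(census)  # loadings are stored in $loadings

rot <- unclass(pca1$rotation)
ld  <- unclass(pca2$loadings)  # drop the "loadings" class, keep the matrix

# Each column is defined only up to sign (reflection), so compare
# absolute values rather than the raw entries.
same_up_to_sign <- isTRUE(all.equal(abs(rot), abs(ld),
                                    check.attributes = FALSE,
                                    tolerance = 1e-6))
print(same_up_to_sign)

# An actual rotation: varimax on the first two components
# (keeping two components is an assumption made for illustration).
vm <- varimax(pca1$rotation[, 1:2])
print(vm$loadings)  # rotated loadings
print(vm$rotmat)    # 2 x 2 rotation matrix that was applied
```

Printing sign(rot) == sign(ld) shows which columns are reflected
between the two functions.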

Also stored in the output from prcomp() is a matrix called "x".
Its columns are the principal component scores, one column per
component.
They are uncorrelated and could be used as explanatory variables
in a regression analysis. But
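
A minimal sketch of using the scores in lm() is given here. The
original post never says what the response is, so y below is a
HYPOTHETICAL simulated vector that only stands in for the real
response, and keeping PC1 and PC2 is likewise an assumption.

```r
# Census data copied from the original post (columns V1..V5).
census <- read.table(text = "
5.935 14.2 2.265 2.27 2.91
1.523 13.1 0.597 0.75 2.62
2.599 12.7 1.237 1.11 1.72
4.009 15.2 1.649 0.81 3.02
4.687 14.7 2.312 2.50 2.22
8.044 15.6 3.641 4.51 2.36
2.766 13.3 1.244 1.03 1.97
6.538 17.0 2.618 2.39 1.85
6.451 12.9 3.147 5.52 2.01
3.314 12.2 1.606 2.18 1.82
3.777 13.0 2.119 2.83 1.80
1.530 13.8 0.798 0.84 4.25
2.768 13.6 1.336 1.75 2.64
6.585 14.9 2.763 1.91 3.17
", header = FALSE)

pca1   <- prcomp(census)
scores <- as.data.frame(pca1$x)  # columns PC1..PC5, one row per case

# The scores are mutually uncorrelated: off-diagonals are ~0.
score_cor <- cor(scores)
print(round(score_cor, 10))

# HYPOTHETICAL response, simulated only so the example runs;
# replace it with the real response variable.
set.seed(1)
y <- rnorm(nrow(census))

# Regress the response on the first two component scores
# (the number of components kept is an assumption).
dat    <- cbind(y = y, scores)
linmod <- lm(y ~ PC1 + PC2, data = dat)
print(coef(linmod))
```

In the notation of the original question, a and b correspond to
the columns PC1 and PC2 of pca1$x.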

-------------------------------------
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352


-----Original Message-----
From: r-help-bounces at r-project.org
[mailto:r-help-bounces at r-project.org] On Behalf Of Bert Gunter
Sent: Friday, November 22, 2013 11:36 AM
To: Chris Wilkinson
Cc: r-help at r-project.org
Subject: Re: [R] Principal Components in a Linear Model

1. Probably not, depending on what you expect to gain from this.
R's numerical procedures can almost certainly handle the
correlations.

2. Search on "R package for principal components regression"
instead of rolling your own. There are several (e.g.
"chemometrics", "pls", etc.)
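
Point 1 can be checked with a small base-R sketch: when all
components are kept, regressing on the scores gives the same
fitted values as regressing on the original correlated
predictors. The response y is again a HYPOTHETICAL simulated
vector, since the original post does not name one.

```r
# Census data copied from the original post (columns V1..V5).
census <- read.table(text = "
5.935 14.2 2.265 2.27 2.91
1.523 13.1 0.597 0.75 2.62
2.599 12.7 1.237 1.11 1.72
4.009 15.2 1.649 0.81 3.02
4.687 14.7 2.312 2.50 2.22
8.044 15.6 3.641 4.51 2.36
2.766 13.3 1.244 1.03 1.97
6.538 17.0 2.618 2.39 1.85
6.451 12.9 3.147 5.52 2.01
3.314 12.2 1.606 2.18 1.82
3.777 13.0 2.119 2.83 1.80
1.530 13.8 0.798 0.84 4.25
2.768 13.6 1.336 1.75 2.64
6.585 14.9 2.763 1.91 3.17
", header = FALSE)

# HYPOTHETICAL response, simulated only so the example runs.
set.seed(1)
y <- rnorm(nrow(census))

scores <- as.data.frame(prcomp(census)$x)

# Same response, two parameterisations of the same predictor space.
fit_raw <- lm(y ~ ., data = cbind(y = y, census))  # V1..V5
fit_pc  <- lm(y ~ ., data = cbind(y = y, scores))  # PC1..PC5

# With every component retained the fitted values coincide.
same_fit <- isTRUE(all.equal(fitted(fit_raw), fitted(fit_pc)))
print(same_fit)
```

Any difference between the two approaches only appears once some
components are dropped, which is what the principal components
regression packages in point 2 are designed to manage.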

-- Bert

On Fri, Nov 22, 2013 at 8:47 AM, Chris Wilkinson
<kinsham at verizon.net> wrote:
> My data has correlations between predictors so I think it would be
> advantageous to rotate the axes with prcomp().
>
>> census <- read.table(paste("http://www.stat.wisc.edu/~rich/JWMULT02dat",
>                             "T8-5.DAT", sep="/"), header=F)
>> census
>       V1   V2    V3   V4   V5
> 1  5.935 14.2 2.265 2.27 2.91
> 2  1.523 13.1 0.597 0.75 2.62
> 3  2.599 12.7 1.237 1.11 1.72
> 4  4.009 15.2 1.649 0.81 3.02
> 5  4.687 14.7 2.312 2.50 2.22
> 6  8.044 15.6 3.641 4.51 2.36
> 7  2.766 13.3 1.244 1.03 1.97
> 8  6.538 17.0 2.618 2.39 1.85
> 9  6.451 12.9 3.147 5.52 2.01
> 10 3.314 12.2 1.606 2.18 1.82
> 11 3.777 13.0 2.119 2.83 1.80
> 12 1.530 13.8 0.798 0.84 4.25
> 13 2.768 13.6 1.336 1.75 2.64
> 14 6.585 14.9 2.763 1.91 3.17
>
>> pca1 <- prcomp(census)
>> summary(pca1)
> Importance of components:
>                           PC1    PC2     PC3     PC4     PC5
> Standard deviation     2.6327 1.3361 0.62422 0.47909 0.11897
> Proportion of Variance 0.7413 0.1909 0.04168 0.02455 0.00151
> Cumulative Proportion  0.7413 0.9323 0.97394 0.99849 1.00000
>
>> pca1$rotation # eigenvectors
>            PC1         PC2          PC3         PC4          PC5
> V1 -0.78120807  0.07087183 -0.003656607  0.54171007  0.302039670
> V2 -0.30564856  0.76387277  0.161817438 -0.54479937  0.009279632
> V3 -0.33444840 -0.08290788 -0.014841008  0.05101636 -0.937255367
> V4 -0.42600795 -0.57945799 -0.220453468 -0.63601254  0.172145212
> V5  0.05435431  0.26235528 -0.961759720  0.05127599 -0.024583093
>
> I'd like to create a linear model based on the rotated axes.
>
>> linmod <- lm(y~a+b+....)
>
> Could someone be kind enough to suggest how to code a, b...?
>
> Chris
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



-- 

Bert Gunter
Genentech Nonclinical Biostatistics

(650) 467-7374
