[R] Scaling in predict.prcomp

Prof Brian Ripley ripley at stats.ox.ac.uk
Sun Apr 20 17:54:28 CEST 2008


On Sun, 20 Apr 2008, Gad Abraham wrote:

> Hi,
>
> Say x.train is a matrix of covariates that I want to do PCA on, so I can
> do regression on its principal components, and x.test is a test set of
> the same covariates on which I want to evaluate the regression fit. I
> would like the covariates to be centred and scaled:
>
> p <- prcomp(x.train, center=TRUE, scale=TRUE)
> x.train.pc <- predict(p)
>
> Now I want to get the PCs from the test set.

The way to do that is to call prcomp() on the test set.

If you want to project new data onto the PCs of the training set (as a set
of axes in the data space), you just use predict(p, newdata=x.test).
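
A minimal sketch of that projection, assuming simulated data (the matrices
and the column names a, b, c below are made up for illustration and are not
from the original question). Note that the formal argument of prcomp() is
spelled scale. with a trailing dot:

```r
set.seed(1)

# Hypothetical training and test sets with the same three covariates
vars <- c("a", "b", "c")
x.train <- matrix(rnorm(100 * 3), ncol = 3, dimnames = list(NULL, vars))
x.test  <- matrix(rnorm(20 * 3, mean = 1), ncol = 3, dimnames = list(NULL, vars))

# PCA on the centred and scaled training covariates
p <- prcomp(x.train, center = TRUE, scale. = TRUE)

# Scores of the training set; with no newdata this returns p$x
x.train.pc <- predict(p)

# Scores of the test set in the coordinate system of the training PCs
x.test.pc <- predict(p, newdata = x.test)
```

The result has one row per test observation and one column per principal
component, so it can be passed to a regression fitted on x.train.pc.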

> Should I use the same center and scale vectors from the training set:
>
> x.test.pc <- predict(p, newdata=x.test, center=p$center, scale=p$scale)
>
> or use the test set's own centers and scales:
>
> x.test.pc <- predict(p, newdata=x.test, center=TRUE, scale=TRUE)

I see no evidence that those additional arguments are used.

predict.prcomp uses the origin (and the scaling) of the training set's PCs,
since it is that coordinate system which you are projecting onto. The
values stored in p$center and p$scale are applied to newdata for you.
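
To check this, here is a small sketch on simulated data (again, the matrices
and column names are hypothetical). It compares predict() with a manual
projection that applies the training centre and scale, and confirms that
extra center/scale arguments passed to predict() change nothing:

```r
set.seed(1)

# Hypothetical training and test sets with the same three covariates
vars <- c("a", "b", "c")
x.train <- matrix(rnorm(100 * 3), ncol = 3, dimnames = list(NULL, vars))
x.test  <- matrix(rnorm(20 * 3, mean = 1), ncol = 3, dimnames = list(NULL, vars))

p <- prcomp(x.train, center = TRUE, scale. = TRUE)

# What predict() returns for new data
pred <- predict(p, newdata = x.test)

# Manual projection: standardise with the TRAINING centre and scale,
# then rotate onto the training loadings
manual <- scale(x.test, center = p$center, scale = p$scale) %*% p$rotation
same.as.manual <- isTRUE(all.equal(pred, manual, check.attributes = FALSE))

# Extra arguments are swallowed by '...' and have no effect
pred.extra <- predict(p, newdata = x.test, center = TRUE, scale = TRUE)
extras.ignored <- isTRUE(all.equal(pred, pred.extra))
```

Both same.as.manual and extras.ignored should come out TRUE.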

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


