[R] Reconstructing Datasets

Jari Oksanen jarioksa at sun3.oulu.fi
Wed Mar 2 07:39:10 CET 2005


On Wed, 2005-03-02 at 08:30 +0200, Jari Oksanen wrote:
> On Tue, 2005-03-01 at 20:30 +0000, Laura Quinn wrote:
> > Hi,
> > 
> > Is it possible to recreate "smoothed" data sets in R, by performing a PCA
> > and then reconstructing a data set from say the first 2/3 EOFs?
> > 
> > I've had a look in the help pages and don't seem to find anything
> > relevant.
> > 
> It's not in the R help, but in the books about PCA in help references. 
> 
> This can be done, not quite directly. Most of the hassle comes from the
> centring, and I guess in your case, from scaling of the results. I guess
> it is best to first scale the results like PCA would do, then make the
> low-rank approximation, and then de-scale:
> 
> x <- scale(x, scale = TRUE)
> pc <- prcomp(x)
> 
> Full rank will be:
> 
> xfull <- pc$x %*% pc$rotation

Naturally, I forgot the transposition:

xfull <- pc$x %*% t(pc$rotation)

and the check:

range(x - xfull) 

which should be something in magnitude 1e-12 or better (6e-15 in the
test I run).

> 
> The eigenvalues already are incorporated in pc$x, and you don't have to
> care about them.
> 
> Then rank=3 approximation will be:
> 
> x3 <- pc$x[,1:3] %*% pc$rotation[,1:3]
> 
and the same here:

x3 <- pc$x[,1:3] %*% t(pc$rotation[,1:3])

The moral: cut-and-paste.

> Then you have to "de-scale":
> 
> x3 <- sweep(x3, 2, attr(x, "scaled:scale", "*")
> x3 <- sweep(x3, 2, attr(x, "scaled:center", "+")
> 
And here you need to close the parentheses: 

x3 <- sweep(x3, 2, attr(x, "scaled:scale", "*"))
x3 <- sweep(x3, 2, attr(x, "scaled:center", "+"))

The moral #1: cut-and-paste.
and #2: drink coffee in the morning.

> And here you are. I wouldn't call this a smoothing, though.
> 
> Library 'vegan' can do this automatically for PCA run with function
> 'rda', but there the scaling of raw results is non-conventional (though
> "biplot").
> 

cheers, jari oksanen
-- 
Jari Oksanen <jarioksa at sun3.oulu.fi>




More information about the R-help mailing list