[R] efficient code - yet another question

Richard.Cotton at hsl.gov.uk Richard.Cotton at hsl.gov.uk
Thu May 1 10:57:30 CEST 2008


> I've been trying to find a way to improve (if
> possible) the efficiency of the code, especially when the number of
> components to calculate is higher than 100.

> pca.nipals <- function(X, ncomp, iter = 50, toler = 
sqrt(.Machine$double.eps))
> # X...data matrix, ncomp...number of components,
> # iter...maximal number of iterations per component,
> # toler...precision tolerance for calculation of components
> {
> 
>      Xh <- scale(X, center = TRUE, scale = FALSE)
>      nr <- 0
>      T <- NULL; P <- NULL # matrix of scores and loadings
>      for (h in 1:ncomp)
>               {
>                      th <- Xh[, 1]
>                      ende <- FALSE
>                      while (!ende)
>                        {
>                            nr <- nr + 1
>                            ph <- t(crossprod(th, Xh) * as.vector(1 /
> crossprod(th)))
>                            ph <- ph * as.vector(1/sqrt(crossprod(ph)))
>                            thnew <- t(tcrossprod(t(ph), Xh) *
> as.vector(1/(crossprod(ph))))
>                            prec <- crossprod(th-thnew)
>                            th <- thnew
>                            if (prec <= (tol^2)) ende <- TRUE
>                            if (it <= nr) ende <- TRUE # didn't converge
>                        }
> 
>                      Xh <- Xh - tcrossprod(th)
>                      T <- cbind(T, th); P <- cbind(P, ph)
>                      nr <- 0
>                }
>      list(T = T, P = P)
> }

First a disclaimer: I've just had a quick eyeball of your code; I haven't 
run it, so this is speculative.

You've named the tolerance variable as 'toler', but in the line where you 
come to check it, it is called tol:
if (prec <= (tol^2)) ende <- TRUE

I suspect that this means you have 'tol' as a global variable somewhere, 
which may or may not be the number you want.  Even if the correct variable 
is being found, I suspect that you want prec <= tol, rather than the 
square of it (you are probably wasting time on iterations calculating 
excessive decimal places).

Also, it may be slightly faster to use a for loop instead of the while 
loop, as so:

for(j in 1:iter)
{
  #your calculations
  if(prec <=tol) break
}

Regards,
Richie.

Mathematical Sciences Unit
HSL


------------------------------------------------------------------------
ATTENTION:

This message contains privileged and confidential inform...{{dropped:20}}



More information about the R-help mailing list