[R] very fast OLS regression?

Bernardo Rangel Tura tura at centroin.com.br
Thu Mar 26 11:00:07 CET 2009


On Wed, 2009-03-25 at 22:11 +0100, Dimitris Rizopoulos wrote:
> check the following options:
> 
> ols1 <- function (y, x) {
>      coef(lm(y ~ x - 1))
> }
> 
> ols2 <- function (y, x) {
>      xy <- t(x)%*%y
>      xxi <- solve(t(x)%*%x)
>      b <- as.vector(xxi%*%xy)
>      b
> }
> 
> ols3 <- function (y, x) {
>      XtX <- crossprod(x)
>      Xty <- crossprod(x, y)
>      solve(XtX, Xty)
> }
> 
> ols4 <- function (y, x) {
>      lm.fit(x, y)$coefficients
> }
> 
> # check timings
> MC <- 500
> N <- 10000
> 
> set.seed(0)
> x <- matrix(rnorm(N*MC), N, MC)
> y <- matrix(rnorm(N*MC), N, MC)
> 
> invisible({gc(); gc(); gc()})
> system.time(for (mc in 1:MC) ols1(y[, mc], x[, mc]))
> 
> invisible({gc(); gc(); gc()})
> system.time(for (mc in 1:MC) ols2(y[, mc], x[, mc]))
> 
> invisible({gc(); gc(); gc()})
> system.time(for (mc in 1:MC) ols3(y[, mc], x[, mc]))
> 
> invisible({gc(); gc(); gc()})
> system.time(for (mc in 1:MC) ols4(y[, mc], x[, mc, drop = FALSE]))

Hi Dimitris and Ivo

I think this is not a fair comparison, look this

x[8,100]<-NA

system.time(for (mc in 1:MC) ols1(y[, mc], x[, mc]))
   user  system elapsed 
  8.765   0.000   8.762 
 

 system.time(for (mc in 1:MC) ols2(y[, mc], x[, mc]))
Error in solve.default(t(x) %*% x) : 
  system is computationally singular: reciprocal condition number = 0
Timing stopped at: 0 0 0.002 
 

 system.time(for (mc in 1:MC) ols3(y[, mc], x[, mc]))
Error in solve.default(XtX, Xty) : 
  system is computationally singular: reciprocal condition number = 0
Timing stopped at: 0 0 0.001 
 

 system.time(for (mc in 1:MC) ols4(y[, mc], x[, mc, drop = FALSE]))
Error in lm.fit(x, y) : NA/NaN/Inf in foreign function call (arg 1)
Timing stopped at: 0 0 0.001 

So routines ols2, ols3 and ols4 only functional in fully matrix if have
one NA this functions don't run.

-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil




More information about the R-help mailing list