[R] crossprod vs %*% timing

Wed Oct 6 11:37:38 CEST 2004

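The order of the operations affects the computation time, as the
following comparison of six variants shows. Each output row comes from
system.time(); the first number is the user CPU time in seconds. Here
f6 and f3 are the fastest, f1 and f4 the slowest.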

<<*>>=
f1 <- function(a,X){ ignore <- t(a) %*% X %*% a               }
f2 <- function(a,X){ ignore <- crossprod(t(crossprod(a,X)),a) }
f3 <- function(a,X){ ignore <- crossprod(a,X) %*% a           }
f4 <- function(a,X){ ignore <- (t(a) %*% X) %*% a             }
f5 <- function(a,X){ ignore <- t(a) %*% (X %*% a)             }
f6 <- function(a,X){ ignore <- crossprod(a,crossprod(X,a)) }

a <- rnorm(100); X <- matrix(rnorm(10000),100,100)

print(system.time( for(i in 1:10000){ a1<-f1(a,X)}))
print(system.time( for(i in 1:10000){ a2<-f2(a,X)}))
print(system.time( for(i in 1:10000){ a3<-f3(a,X)}))
print(system.time( for(i in 1:10000){ a4<-f4(a,X)}))
print(system.time( for(i in 1:10000){ a5<-f5(a,X)}))
print(system.time( for(i in 1:10000){ a6<-f6(a,X)}))
c(a1,a2,a3,a4,a5,a6)

@
output-start
[1] 4.06 0.04 4.11 0.00 0.00
[1] 1.48 0.00 1.53 0.00 0.00
[1] 1.17 0.00 1.22 0.00 0.00
[1] 4.10 0.01 4.39 0.00 0.00
[1] 2.58 0.01 3.24 0.00 0.00
[1] 1.10 0.00 1.29 0.00 0.00
Wed Oct  6 11:26:38 2004
[1] -79.34809 -79.34809 -79.34809 -79.34809 -79.34809 -79.34809
output-end
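
To rerun the comparison on another machine, the six variants can be
collected in a list and timed in one loop. This is only a sketch: the
names `variants` and `time_variant` are made up for illustration, the
repetition count is reduced to 1000, and the timings will differ from
those shown above.

```r
# Sketch: time the six variants in one loop (names are illustrative only).
variants <- list(
  f1 = function(a, X) t(a) %*% X %*% a,
  f2 = function(a, X) crossprod(t(crossprod(a, X)), a),
  f3 = function(a, X) crossprod(a, X) %*% a,
  f4 = function(a, X) (t(a) %*% X) %*% a,
  f5 = function(a, X) t(a) %*% (X %*% a),
  f6 = function(a, X) crossprod(a, crossprod(X, a))
)

set.seed(1)  # fixed seed so that reruns use the same data
a <- rnorm(100); X <- matrix(rnorm(10000), 100, 100)

# All variants compute the same quadratic form, up to rounding error.
values <- sapply(variants, function(f) as.numeric(f(a, X)))
stopifnot(isTRUE(all.equal(unname(values), rep(values[[1]], length(values)))))

# Elapsed time in seconds for n_rep calls of one variant.
time_variant <- function(f, n_rep = 1000) {
  system.time(for (i in seq_len(n_rep)) f(a, X))[["elapsed"]]
}
timings <- sapply(variants, time_variant)
print(round(timings, 2))
```

Checking that the values agree before timing guards against comparing
variants that do not compute the same thing.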

Peter Wolf

Robin Hankin wrote:

> Hi
>
> the manpage says that crossprod(x,y) is formally equivalent to, but
> faster than, the call 't(x) %*% y'.
>
> I have a vector 'a' and a matrix 'A', and need to evaluate 't(a) %*% A
> %*% a' many many times, and performance is becoming crucial.  With
>
> f1 <- function(a,X){ ignore <- t(a) %*% X %*% a               }
> f2 <- function(a,X){ ignore <- crossprod(t(crossprod(a,X)),a) }
> f3 <- function(a,X){ ignore <- crossprod(a,X) %*% a           }
>
> a <- rnorm(100)
> X <- matrix(rnorm(10000),100,100)
>
> print(system.time( for(i in 1:10000){ f1(a,X)}))
> print(system.time( for(i in 1:10000){ f2(a,X)}))
> print(system.time( for(i in 1:10000){ f3(a,X)}))
>
>
> I get something like:
>
> [1] 2.68 0.05 2.66 0.00 0.00
> [1] 0.48 0.00 0.49 0.00 0.00
> [1] 0.29 0.00 0.31 0.00 0.00
>
> with quite low variability from run to run.  What surprises me is the
> third figure: about 40% faster than the second one, the extra time
> possibly related to the call to t() (and Rprof shows about 35% of
> total time in t() for my application).
>
> So it looks like f3() is the winner hands down, at least for this
> task.  What is a good way of thinking about such issues?  Anyone got
> any performance tips?
>
> I quite often need things like 'a %*% X %*% t(Y) %*% Z %*% t(b)' which
> would be something like
> crossprod(t(crossprod(t(crossprod(t(crossprod(a,X)),t(Y))),Z)),t(b))
> (I think).
>
> (R-1.9.1, 2GHz G5 PowerPC, MacOSX10.3.5)
>
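
For the longer product in the quoted message, one way to think about it
is to look at the size of each intermediate result. The sketch below
assumes shapes that the message does not state: `a` is a vector of
length n, `X`, `Y` and `Z` are n x n, and `b` is a 1 x n matrix. It
checks that the plain chain, the nested crossprod() form, and a version
that multiplies the matrices first all give the same value.

```r
# Sketch under assumed shapes: a is a length-n vector, X, Y, Z are n x n,
# and b is a 1 x n matrix so that t(b) is n x 1 (these shapes are assumptions).
set.seed(1)
n <- 50
a <- rnorm(n)
X <- matrix(rnorm(n * n), n, n)
Y <- matrix(rnorm(n * n), n, n)
Z <- matrix(rnorm(n * n), n, n)
b <- matrix(rnorm(n), 1, n)

# %*% associates left to right, so every intermediate result here is 1 x n.
chain <- a %*% X %*% t(Y) %*% Z %*% t(b)

# The nested crossprod() form from the message.
nested <- crossprod(t(crossprod(t(crossprod(t(crossprod(a, X)), t(Y))), Z)), t(b))

# Forcing the matrix-matrix products first gives the same value but
# builds n x n intermediates along the way.
matrix_first <- a %*% ((X %*% t(Y)) %*% Z) %*% t(b)

stopifnot(isTRUE(all.equal(chain, nested)),
          isTRUE(all.equal(chain, matrix_first)))
```

A vector-matrix product costs on the order of n^2 operations and a
matrix-matrix product on the order of n^3, so keeping the vector at the
front of the chain should do less arithmetic. Whether that shows up in
the timings on a given machine would have to be measured.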