[Rd] the incredible lightness of crossprod
andy_liaw at merck.com
Thu Jan 27 20:33:36 CET 2005
> From: Douglas Bates
> Patrick Burns wrote:
> > The following is at least as much out of intellectual curiosity
> > as for practical reasons.
> > On reviewing some code written by novices to R, I came
> > across:
> > crossprod(x, y)[1,1]
> > I thought, "That isn't a very S way of saying that, I wonder
> > what the penalty is for using 'crossprod'." To my surprise the
> > penalty was substantially negative. Handily the client had S-PLUS
> > as well -- there the sign of the penalty was as I had expected, but
> > the order of magnitude was off.
> > Here are the timings of 1 million computations on vectors of
> > length 1000. This is under Windows, R version 1.9.1 and S-PLUS
> > 6.2 (on the same machine).
> > Command R
> > sum(x * y) 28.61
> > crossprod(x, y)[1,1] 6.77 2256.2
> > Another example is when computing the sums of the columns of a
> > matrix. For example:
> > set.seed(1)
> > jjm <- matrix(rnorm(600), 5)
> > Timings for this under Windows 2000 with R version 2.0.1 (on an
> > old chip running at about 0.7Ghz) for 100,000 computations are:
> > apply(jjm, 2, sum) 536.59
> > colSums(jjm) 18.26
> > rep(1,5) %*% jjm 15.41
> > crossprod(rep(1,5), jjm) 13.16
> > (These timings seem to be stable across R versions and on at least
> > one Linux platform.)
> > Andy Liaw showed another example of 'crossprod' being fast a couple
> > days ago on R-help.
> > Questions for those with a more global picture of the code:
> > * Is the speed advantage of 'crossprod' inherent, or is it because
> > more care has been taken with its implementation than the other
> > functions?
> > * Is 'crossprod' faster than 'sum(x * y)' because 'crossprod' is
> > going to BLAS while 'sum' can't?
> For a numeric matrix crossprod ends up calling level 3 BLAS; either
> dsyrk for the single argument case or dgemm for the two
> argument case.
> Especially in accelerated versions of the BLAS like Atlas or Goto's
> BLAS, those routines are hideously efficient and that's where you are
> seeing the big gain in speed.
> By the way, you didn't mention if you had an accelerated BLAS
> with R. Do you?
For the case of crossprod(x, w) for computing weighted mean of x (the
posting that Pat referred to), I tried the ATLAS-linked Rblas.dll on CRAN,
and it's actually slower than the stock Rblas.dll. I believe Duncan M. had
observed similar things for small-ish problems. (I used a Pentium M laptop,
and tried both the P3 and P4 versions.)
So, I think my main point is that even with non-optimized BLAS crossprod can
be way faster.
> R-devel at stat.math.ethz.ch mailing list
More information about the R-devel