[Rd] [R-SIG-Mac] BLAS benchmarks on R 2.12.0
simon.urbanek at r-project.org
Wed Nov 3 11:31:42 CET 2010
On Nov 3, 2010, at 12:07 AM, Michael Spiegel wrote:
> I would like to apologize for cross-posting this message. But I
> realized that one of my questions may be more appropriate for the
> SIG-Mac mailing list rather than the R-devel mailing list. You may
> wish to ignore the other parts of the email. My question is the
> "I can't seem to reproduce the speed of the 2.11.1 reference BLAS
> library. What compiler, which version of the compiler, and what flags
> are used when an [OS X] R binary is distributed? My test machine is a
> Mac Pro, so 32-bit x86 architecture."
The current build flags for i386 are listed in
They have not changed recently, so they were the same for 2.11.1. The compiler used is gcc-4.2 from Xcode 3.1.4 (R is built on OS X 10.5.8) with the corresponding Fortran from the tools page.
> ---------- Forwarded message ----------
> From: Michael Spiegel <michael.m.spiegel at gmail.com>
> Date: Sun, Oct 31, 2010 at 12:41 PM
> Subject: BLAS benchmarks on R 2.12.0
> To: r-devel at stat.math.ethz.ch
> Cc: OpenMx Developers <openmx-developers at list.mail.virginia.edu>
> I saw on the mailing list and in the NEWS file that some unsafe math
> transformations were disabled for the reference BLAS implementation
> that is used in R. We have a set of performance tests for the OpenMx
> library, and some of the tests have a x3-10 slowdown in R 2.12.0
> versus 2.11.1. When I copy the shared library libRblas.0.dylib from
> the 2.11.1 installation into the 2.12.0 installation, the slowdown
> goes away. It seems reasonable that BLAS should conform to IEEE
> requirements. For the purposes of our library, we are considering two
> options but I need some advice on both choices:
> 1) Compile the reference BLAS implementation with unsafe optimizations
> and include it as a part of the OpenMx library. I can't seem to
> reproduce the speed of the 2.11.1 reference BLAS library. What
> compiler, which version of the compiler, and what flags are used when
> an R binary is distributed? My test machine is a Mac Pro, that may
> change the answer.
> 2) Is there any support for adding a libRblas.unsafe.dylib shared
> library in the R installation, much like libRblas.veclib.dylib is
> currently included in OS X binaries? Then we could just change the
> OpenMx shared library to use the unsafe library when we give it to
> users. We currently change the OpenMx shared library to use the
> reference blas implementation, because it is faster than the veclib
> implementation for small matrices.
Although it would be technically possible, it would mean including an Rblas built by a different version of R, which doesn't sound right (the flags are set by configure, so there is no easy way to reproduce it in the current R without manual intervention). However, I'm considering adding some alternatives (such as ATLAS) to the build, along with a tool to change the BLAS used, possibly as a separate package.
To get back to your problem - did you try the single-threaded ATLAS? The optimizations are not necessarily just for large matrices - in fact ATLAS has several code paths depending on the size of the problem. (I'm traveling at the moment so I can't run any tests.)
However, as John pointed out, it is pretty much impossible to get a general solution, because any benchmark is limited to a particular problem and size, yet the mechanisms involved are far more complex. As I was pointing out for the recently discussed speed patches, a seemingly clear optimization can in reality be slower than the untouched code, depending on compilers, flags and architecture.
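To make the point about problem size concrete, here is a minimal sketch of a timing harness that runs the same operation over several matrix sizes, so results can be compared per size rather than from a single number. It is written in Python purely for illustration and is not what was run for this thread. The names `matmul`, `time_per_call` and `benchmark` are hypothetical, and the naive `matmul` is only a stand-in for whatever BLAS-backed call is actually under test.

```python
import random
import time


def matmul(a, b):
    """Naive dense matrix product (stand-in for a BLAS-backed call)."""
    bt = list(zip(*b))  # columns of b
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def random_matrix(n, rng):
    """Return an n x n matrix of uniform random floats."""
    return [[rng.random() for _ in range(n)] for _ in range(n)]


def time_per_call(func, n, repeats=5, seed=0):
    """Best-of-`repeats` wall time in seconds for func on two n x n matrices."""
    rng = random.Random(seed)
    a, b = random_matrix(n, rng), random_matrix(n, rng)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func(a, b)
        best = min(best, time.perf_counter() - start)
    return best


def benchmark(func, sizes, repeats=5):
    """Map each matrix size to its best observed time for func."""
    return {n: time_per_call(func, n, repeats) for n in sizes}


if __name__ == "__main__":
    # Sizes are arbitrary; pick the ones that match your real workload.
    for n, secs in benchmark(matmul, [2, 8, 32, 64]).items():
        print(f"n={n:3d}  {secs * 1e6:10.1f} us per call")
```

Running the same harness against each candidate BLAS, with the sizes your application really uses, is more informative than one aggregate figure, but the results still only hold for that compiler, those flags and that machine.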