[Rd] ATLAS threaded 64 bit Opteron build for R: need -fPIC

Fri Feb 27 19:22:29 MET 2004

On 27 Feb 2004, Douglas Bates wrote:

> Martin Maechler <maechler at stat.math.ethz.ch> writes:
> 
> > >>>>> "PD" == Peter Dalgaard <p.dalgaard at biostat.ku.dk>
> > >>>>>     on 26 Feb 2004 15:44:16 +0100 writes:
> > 
> >     PD> Douglas Bates <bates at stat.wisc.edu> writes:
> >     >> Have you tried configuring R with Goto's BLAS
> >     >> http://www.cs.utexas.edu/users/kgoto/
> >     >> 
> >     >> I haven't worked with Opteron or Athlon64 computers but I understand
> >     >> that Goto's BLAS are very effective on those machines.  Furthermore
> >     >> Goto's BLAS are (only) available as .so libraries so you don't need to
> >     >> mess with creating the .so version.
> > 
> >     PD> I tried it, yes. Somewhat to my surprise, it seemed to be not quite as
> >     PD> fast as the threaded ATLAS, but I wasn't very systematic about the
> >     PD> benchmarking.
> > 
> >     PD> (and the Goto items have license issues, which get in the way for
> >     PD> binary distributions.)
> > 
> > Thanks a lot, Peter, Brian, Doug, for your feedback!
> > In the meantime, I have three running versions of R(-devel) on
> > the 64-bit Opteron
> > - "plain"
> > - linked against threaded GOTO
> > - linked against threaded (static) ATLAS (using -fPIC for compilation;
> >   "large" Rlapack)
> > and I find that GOTO is faster than ATLAS
> > consistently (between ~ 5-20%) for several tests
> > (square matrices; %*% and solve).
> > ATLAS is still an order of magnitude faster than "plain" for
> > 3000x3000 matrices.
> 
> Would you be willing to post a brief summary of comparative timings?
> 
> I have thought at times that it may be worthwhile collecting
> comparative timings for different combinations of
>                  processor/OS/memory size and speed/
> on "typical" tasks in R.  As with any benchmark the results will be
> artificial but they can be of some help when considering what hardware
> to purchase.  Bioconductor users may find it particularly helpful to
> be able to evaluate how much they will need to pay to be able to
> analyze large data sets reasonably quickly.
> 
> One easily-obtained timing is at the end of
> $RSRC/tests/Examples/base-Ex.Rout after 'make;make check'.

That one is, I think, rather too artificial, as it contains few even
moderately large examples, and is dominated by a few atypical tasks.

I tend to use the sum of the MASS scripts as an informal timing: ch06.R is 
also a pretty good indicator.

I think you will find that BLAS differences are pretty small in real-life 
analyses, or at least I always have.
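
For readers who want to repeat the kind of comparison discussed in this
thread (square matrices; %*% and solve), the following is a minimal
sketch only. The helper name time_blas, the matrix sizes and the number
of repetitions are illustrative assumptions, not anything used by the
posters. It measures BLAS/LAPACK-bound operations in isolation and says
nothing about the real-life analyses mentioned above.

```r
# Hypothetical helper (not from the thread): average elapsed time of
# %*% and solve() on an n x n matrix of standard normal draws.
time_blas <- function(n, reps = 3) {
  set.seed(1)                        # same matrix on every run
  a <- matrix(rnorm(n * n), n, n)    # almost surely invertible
  mm <- system.time(for (i in seq_len(reps)) a %*% a)[["elapsed"]] / reps
  sv <- system.time(for (i in seq_len(reps)) solve(a))[["elapsed"]] / reps
  c(n = n, matmul = mm, solve = sv)
}

# Sizes are kept small here so the sketch finishes quickly even with the
# reference BLAS; raise them (the thread mentions 3000x3000) to see
# larger differences between BLAS builds.
sizes <- c(200, 400)
timings <- t(sapply(sizes, time_blas))
print(timings)
```

Running the same script under each R build (plain, Goto, ATLAS) and
comparing the rows gives a rough per-operation comparison.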

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595