[Rd] ATLAS threaded 64 bit Opteron build for R: need -fPIC

Prof Brian Ripley ripley at stats.ox.ac.uk
Fri Feb 10 12:14:17 CET 2006

On Fri, 10 Feb 2006, Amit Aronovitch wrote:

You set the reply address to Martin Maechler!  That's antisocial.

> Hi,
> Sorry for sending such a late reply, and for being a bit OT.
>  I've been trying to compile 64 bit ATLAS for numpy 
> (http://numeric.scipy.org/ ), and so far this thread is the most useful 
> one I could google up - thanks!
>  I encountered similar problems, and so far could not get a .a linkable 
> to numpy (comparing to your post - it seems I might have forgotten to 
> add the -fPIC for the F77FLAGS or MMFLAGS).

Yes, that _is_ in the R-admin manual.  I guess you have not read that - it 
describes how to install R.  You can get it in the R tarball from


> Also, I'm having trouble with the ATLAS lapack. To get a usable lib, one 
> has to merge it with a full lapack implementation (as described in the 
> ATLAS errata). However, I'm using RHEL4, and their installed liblapack.a 
> seems to have been compiled without -fPIC, so the merged library is 
> unlinkable to numpy's .so. Is there a way to use Redhat's installed 
> liblapack.so?

No, nor should you want to.  If RHEL4 is like FC3/4 watch out, as RH have 
managed to get BLAS routines in liblapack and not libblas, and use 
incorrect patches to LAPACK 3.0.  (Again, see the latest R-admin manual.)

> Few questions about your compiler flags:
> 1) Is there a reason to compile with -O rather than -O3?
> (did you try and encounter some problem, or found no major performance
> difference)

ATLAS chose that.  Since the real work is done by hand-tuned assembler 
code it should not matter.

> 2) I see you use -mfpmath=387 - does this work better than sse2 (which
> seems to be
> the default)? How about the "sse,387" option - should I try that?

Depends on your ATLAS version.  Again, ATLAS chose those.

As it happens, I have been trying to build ATLAS on my new dual Opteron 
box this morning.  The latest devel version (3.7.11) does not build, as at 
some point it says it expects the GNU x86-32 assembler.  If it did build, it 
would use SSE3 and so be faster.

Both 3.6.0 and 3.7.11 fail because my machine is too fast, and I had to 
increase the number of replications (1000) in make/Make.{mv,r1}tune and in 
tune/blas/level1/*.c.  Even then I do not entirely trust the results (and 
the two versions report different L1 cache sizes ...).

I got pretty exasperated with this (it needed about ten builds to get one 
that succeeded).  Both ACML and the Goto BLAS work well out of the box on 
Opterons, but do have licence issues. (Again, see the R-admin manual for 

> Martin Maechler wrote:
>>>>>>> "PD" == Peter Dalgaard <p.dalgaard at biostat.ku.dk>
>>>>>>>     on 26 Feb 2004 15:44:16 +0100 writes:
>>    PD> Douglas Bates <bates at stat.wisc.edu> writes:
>>   >> Have you tried configuring R with Goto's BLAS
>>   >> http://www.cs.utexas.edu/users/kgoto/
>>   >>
>>   >> I haven't worked with Opteron or Athlon64 computers but I understand
>>   >> that Goto's BLAS are very effective on those machines.  Furthermore
>>   >> Goto's BLAS are (only) available as .so libraries so you don't need to
>>   >> mess with creating the .so version.
>>    PD> I tried it, yes. Somewhat to my surprise, it seemed to be not quite as
>>    PD> fast as the threaded ATLAS, but I wasn't very systematic about the
>>    PD> benchmarking.
>>    PD> (and the Goto items have license issues, which get in the way for
>>    PD> binary distributions.)
>> Thanks a lot, Peter, Brian, Doug, for your feedback!
>> In the meantime, I have three running versions of R(-devel) on
>> the 64-Opteron
>> - "plain"
>> - linked against threaded GOTO
>> - linked against threaded (static) ATLAS  (using -fPIC for compilation;
>> 					   "large" Rlapack)
>> and I find that GOTO is faster than ATLAS
>> consistently (between ~ 5-20%) for several tests
>> (square matrices; %*% and solve).
>> ATLAS is still an order of magnitude faster than "plain" for
>> 3000x3000 matrices.
>> Here are somewhat repeatable "ATLAS for R" build instructions:
>> 1. get ATLAS source; unpack
>> 2. make : use defaults and "express" installation
>> 3. Before "make install ...", edit the  Make.<ARCHITECTURE> file:
>>    add "-fPIC" to three places, namely  F77FLAGS, CCFLAG0, and MMFLAGS:
>>    which in case of the "threaded Opteron" architecture, leads to
>>    the three new lines
>>       F77FLAGS = -fPIC -fomit-frame-pointer -O -m64
>>       CCFLAG0 = -fPIC -fomit-frame-pointer -O -mfpmath=387 -m64
>>       MMFLAGS = -fPIC -fomit-frame-pointer -O -mfpmath=387 -m64
>>    in the file   Make.Linux_HAMMER64SSE2_2
>> 4. make install arch=Linux_HAMMER64SSE2_2
>> 5. Symlink the ATLAS libraries into /usr/local/lib:
>>    cd /usr/local/lib
>>    ln -s <ATLAS_build_dir>/lib/Linux_HAMMER64SSE2_2/lib* .
>> 6. (needed for runtime!):
>>    Use environment variable LD_LIBRARY_PATH=/usr/local/lib
>> Note that I haven't built *.so (shared) libraries yet.
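
[Editorial sketch] Step 3 of the quoted instructions (adding -fPIC to F77FLAGS, CCFLAG0 and MMFLAGS) could be scripted roughly as follows. The helper name add_fpic is hypothetical and not part of ATLAS. The sketch assumes each of the three variables is defined on a single line of the form "NAME = flags" in the Make.<ARCHITECTURE> file, as in the three lines quoted in step 3; lines that already contain -fPIC are left alone.

```shell
#!/bin/sh
# add_fpic (hypothetical helper, not part of ATLAS):
# insert -fPIC right after the "=" of F77FLAGS, CCFLAG0 and MMFLAGS
# in the given Make.<ARCHITECTURE> file.
add_fpic() {
    makefile=$1
    tmp="$makefile.tmp.$$"
    # '/-fPIC/b' skips any line that already carries the flag, so the
    # helper can be run more than once without duplicating -fPIC.
    sed -e '/-fPIC/b' \
        -e 's/^\([[:space:]]*F77FLAGS[[:space:]]*=[[:space:]]*\)/\1-fPIC /' \
        -e 's/^\([[:space:]]*CCFLAG0[[:space:]]*=[[:space:]]*\)/\1-fPIC /' \
        -e 's/^\([[:space:]]*MMFLAGS[[:space:]]*=[[:space:]]*\)/\1-fPIC /' \
        "$makefile" > "$tmp" && mv "$tmp" "$makefile"
}

# Example call, using the file name from step 3 (run it in the ATLAS
# build directory, before "make install"):
#   add_fpic Make.Linux_HAMMER64SSE2_2
```

Keeping a copy of the original Make.<ARCHITECTURE> file before running such a helper makes it easy to diff the result against the three lines given in step 3.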

Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
