[Rd] R problems with lapack with gfortran

Thomas König tk @end|ng |rom tkoen|g@net
Wed Apr 24 23:32:16 CEST 2019


I have tried to pinpoint potential problems which could lead to the
LAPACK issues that are currently seen in R.  I built the current R
trunk using

AR=gcc-ar RANLIB=gcc-ranlib ./configure --prefix=$HOME --enable-lto \
    --enable-BLAS-shlib=no --without-recommended-packages

and used this to find problem areas.

There are quite a few warnings that were flagged, due to mismatches
in function types.

The prototypes that R has in its header files, for example BLAS.h,
are often not compatible with gfortran function declarations.  To take
one small example, in src/main/print.c, we have

void NORET F77_NAME(xerbla)(const char *srname, int *info)

so xerbla_ is defined with two arguments.

However, gfortran passes string lengths as hidden arguments.
You can see this by compiling the small example

$ cat xer.f
$ gfortran -c -fdump-tree-original xer.f
$ cat xer.f.004t.original
foo ()
   integer(kind=4) info;

   xerbla (&"FOO"[1]{lb: 1 sz: 1}, &info, 3);

so here we have three arguments. This mismatch is flagged
by -Wlto-type-mismatch, which, for example, yields

print.c:1120:12: note: type 'void' should match type 'long int'
../../src/extra/blas/blas.f:357:20: warning: type of 'xerbla' does not match original declaration [-Wlto-type-mismatch]
   357 |          CALL XERBLA( 'DGBMV ', INFO )
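
The three-argument call in the tree dump can be mirrored on the C side.
The following is a minimal sketch, not code from R or gfortran: the names
demo_xerbla_, demo_call, last_name and last_info are hypothetical, and it
assumes the hidden length is passed by value as a size_t after the
explicit arguments. Note that the Fortran string is not NUL-terminated,
so the callee has to rely on the length it receives.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical storage so the effect of the call can be inspected. */
char last_name[32];
int last_info;

/* Hypothetical stand-in for a Fortran-callable handler. The third
 * parameter is the hidden string length that gfortran appends after
 * the explicit arguments (assumed here to be a size_t). */
void demo_xerbla_(const char *srname, int *info, size_t srname_len)
{
    assert(srname != NULL && info != NULL);

    /* Copy at most srname_len characters; the source is not
     * NUL-terminated when it comes from Fortran. */
    size_t n = srname_len;
    if (n > sizeof last_name - 1)
        n = sizeof last_name - 1;
    memcpy(last_name, srname, n);
    last_name[n] = '\0';
    last_info = *info;
}

/* Mimics the generated call  xerbla (&"FOO", &info, 3)  from the dump. */
int demo_call(void)
{
    int info = 2;
    demo_xerbla_("FOO", &info, 3);
    return last_info;
}
```

With a two-argument definition, as in print.c, the third value is still
passed by the Fortran caller but has no matching parameter in the callee.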

So, why can gcc's r268992 / r269349 matter? Before these patches,
gfortran used the variadic calling convention for calling procedures
outside the current file, and the non-variadic calling convention for
calling procedures found in the current file.

Because the procedures were all compiled as non-variadic, the caller's and
the callee's signatures did not match if they were not in the same
source file, which is an ABI violation.

This violation manifested itself in https://gcc.gnu.org/PR87689 ,
where the problem resulted in crashes on a primary gcc platform.

How can this potentially affect R?  After the fix for PR87689,
gfortran's calls to external procedures are no longer variadic.  It is
quite possible that, while this "works" most of the time, there
is a problem with a particular LAPACK routine, the call sequence
leading up to it or the procedures it calls.

How to fix this problem?  The only clear way I see is to fix this
on the R side, by adding the string lengths to the prototypes.
These are size_t (64 bit on 64-bit systems, 32 bit on 32-bit
systems).  You should then try to make --enable-lto pass
without any warnings.
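
A minimal sketch of what such a prototype could look like when a routine
takes more than one character argument. The routine demo_check_ and the
wrapper demo_check_from_c are hypothetical and written in C only so the
example is self-contained; a real BLAS/LAPACK routine would be compiled
from Fortran. The sketch assumes the gfortran convention that the hidden
lengths follow all explicit arguments, in the order of the character
arguments, each as a size_t.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical Fortran-style routine with two CHARACTER arguments.
 * The hidden lengths uplo_len and trans_len come last, in the same
 * order as the character arguments they belong to. The return value
 * imitates an xerbla-style "which argument was bad" code. */
int demo_check_(const char *uplo, const char *trans, const int *n,
                size_t uplo_len, size_t trans_len)
{
    assert(uplo != NULL && trans != NULL && n != NULL);

    if (uplo_len < 1 || trans_len < 1)
        return -1;                      /* empty string passed */
    if (*uplo != 'U' && *uplo != 'L')
        return 1;                       /* first argument invalid */
    if (*trans != 'N' && *trans != 'T' && *trans != 'C')
        return 2;                       /* second argument invalid */
    if (*n < 0)
        return 3;                       /* third argument invalid */
    return 0;
}

/* C-side caller: every character argument gets an explicit length,
 * so the call matches the prototype above argument for argument. */
int demo_check_from_c(char uplo, char trans, int n)
{
    return demo_check_(&uplo, &trans, &n, (size_t)1, (size_t)1);
}
```

For example, demo_check_from_c('U', 'N', 4) returns 0, while
demo_check_from_c('X', 'N', 4) returns 1. A prototype written this way
gives the LTO type check the same argument count on both sides.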

Regarding LAPACK itself, the default build system for R builds
it as a shared library.  Offhand, I did not see any way to
build a *.a file instead, so I could not use LTO to check
for mismatched prototypes between R and LAPACK.

Of course, I cannot be sure that this is really the root cause
of the problem you are seeing, but it does seem to fit quite well.
I hope this analysis helps in resolving this.


