[Rd] request: allow inline functions in R

Li Long lilong at isb-sib.ch
Fri May 14 16:02:06 CEST 2004


Hi, R core developers,

I work at the Swiss Institute of Bioinformatics.  We have two Intel
Itanium2 clusters on which bioinformaticians crunch their data.  One
piece of software they use heavily is R with Bioconductor.  I have already
ported R and the R packages to this platform, and am now working on
optimizing their performance.  I'm using the Intel C/C++ compiler on this
platform, running Linux.  One of my findings is that turning some functions
in R into "inline" functions boosts performance significantly.

While R follows the strict C89 standard right now, there are good
reasons to relax that rule somewhat.  From my experience in industrial
software development, I understand very well both the portability and the
backward-compatibility issues.  I also see the hidden cost of holding
back for too long and not fully realizing the potential of new technology.
I therefore recommend that we allow "inline" functions in R's C code.

The following explains why I recommend the above.

In modern processor microarchitectures, pipelining is a major approach to
achieving higher clock speeds.  Super-pipelining divides the pipeline into
finer-grained stages.  With far more instructions in flight in a
super-pipelined microarchitecture, events that disrupt the pipelining,
such as cache misses, interrupts and branch mispredictions, can be costly.

A case in point is the Intel Itanium architecture, EPIC (explicitly
parallel instruction computing).  EPIC enables the programmer or compiler
to indicate the inherent parallelism of a program *explicitly* in the
instruction sequence.  The main features to improve performance are:
application registers, predication, branching and register rotation.  The
implication is that the cost of disrupting the pipeline is magnified
greatly on this architecture.

In R, there are quite a few simple functions that are called extremely
often, such as "R_IsNaNorNA", "R_finite", etc.  They are used heavily
inside hot loops.  Each call disrupts the pipelining and negatively
affects the performance of the software.  For instance, on IA64, a call to
the system's "isnan" costs 4 cycles, while a wrapper like "R_IsNaNorNA"
could cost several times more.

One way to reduce this kind of disruption in C/C++ is to "inline" a
function, i.e., to integrate it into the code of its callers, eliminating
the function-call overhead.  The benefits of inlining are greatest for
very short functions.
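
As a rough illustration of the point, here is a minimal sketch of a short
predicate declared "static inline" and used in a hot loop.  The names
is_nan_or_na and count_missing are made up for this example and are not R's
actual code; whether the compiler really expands the call depends on the
compiler and on the optimization level.

```c
#include <math.h>    /* isnan (C99 classification macro) */
#include <stddef.h>  /* size_t */

/* Hypothetical stand-in for a small predicate such as R_IsNaNorNA.
   "static inline" lets the compiler expand it at each call site. */
static inline int is_nan_or_na(double x)
{
    return isnan(x) != 0;
}

/* Hot loop: when the compiler inlines the predicate, no call/return
   sequence is executed per element. */
size_t count_missing(const double *x, size_t n)
{
    size_t i, count = 0;

    for (i = 0; i < n; i++)
        if (is_nan_or_na(x[i]))
            count++;
    return count;
}
```

If the compiler decides not to inline, the code still behaves the same; it
only pays the ordinary call overhead.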

On Unix and Linux, inline functions can already be found in the standard
.h files in /usr/include or /usr/local/include.

C++ has supported "inline" functions from the beginning, while the "inline"
keyword was introduced to C in the C99 standard in 1999.  A feature that
has been in the standard for so many years is considered very mature in the
computer industry.  Many C++ compilers actually translate C++ code to C
code, so it's quite natural for the corresponding C compilers to support
inline functions.  The compiler can choose either to generate a function
call or to inline the function, so this feature poses little risk to the
application.

The default compilers that R uses, gcc/g++, have supported it at least
since version 2.95 (July 1999).  The GCC User's Manual states that inlining
really "works" only in optimizing compilation.

Since R already requires C, C++ and FORTRAN compilers, it would be no
surprise for "inline" functions to be allowed.  This would not only improve
the performance of R on modern processors with little effort, but also
encourage people to develop and use R packages on more challenging
problems.

In the configure step, R already checks for many OS- and compiler-related
issues; this could be just one more check.  I expect that the initial use
of inline functions would be mainly for small but heavily used functions,
so the impact of such a change could be managed.
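
A sketch of how the result of such a configure check might be used in a
header.  HAVE_C_INLINE, MY_INLINE and is_finite_value are hypothetical names
chosen for this example, not existing R macros or functions; on a compiler
without "inline" the function simply remains an ordinary static function.

```c
/* HAVE_C_INLINE is a hypothetical macro that a configure test would
   define when the compiler accepts the "inline" keyword. */
#if defined(HAVE_C_INLINE)
# define MY_INLINE inline
#elif defined(__GNUC__)
# define MY_INLINE __inline__  /* GNU alternate keyword, accepted with -ansi */
#else
# define MY_INLINE             /* strict C89: plain static function */
#endif

/* Small, heavily used helper in the spirit of R_finite (illustrative only). */
static MY_INLINE int is_finite_value(double x)
{
    /* x - x is 0 for finite x and NaN for +/-Inf or NaN (IEEE arithmetic) */
    return (x - x) == 0.0;
}
```

Callers only ever see MY_INLINE, so the decision stays in one place and
strict C89 compilers are not affected.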

The attachments are from the GCC User's Manual and the C99 Rationale,
regarding "inline" functions.

Thanks for considering this issue.

Li Long


-------------- next part --------------
From C99 Rationale
==================

6.4.1 Keywords
Several keywords were added in C89: const, enum, signed, void and volatile. New in
C9X are the keywords inline, restrict, _Bool, _Complex and _Imaginary.
Where possible, however, new features have been added by overloading existing keywords, as, for
example, long double instead of extended. It is recognized that each added keyword will
require some existing code that used it as an identifier to be rewritten. No meaningful programs are
known to be quietly changed by adding the new keywords.



6.7.4     Function specifiers
A new feature of C99: The inline keyword, adapted from C++, is a function-specifier that
can be used only in function declarations. It is useful for program optimizations that require the
definition of a function to be visible at the site of a call. (Note that the Standard does not
attempt to specify the nature of these optimizations.)
Visibility is assured if the function has internal linkage, or if it has external linkage and the call
is in the same translation unit as the external definition. In these cases, the presence of the
inline keyword in a declaration or definition of the function has no effect beyond indicating a
preference that calls of that function should be optimized in preference to calls of other
functions declared without the inline keyword.
Visibility is a problem for a call of a function with external linkage where the call is in a
different translation unit from the function's definition. In this case, the inline keyword
allows the translation unit containing the call to also contain a local, or inline, definition of the
function.
A program can contain a translation unit with an external definition, a translation unit with an
inline definition, and a translation unit with a declaration but no definition for a function. Calls
in the latter translation unit will use the external definition as usual.
An inline definition of a function is considered to be a different definition than the external
definition. If a call to some function func with external linkage occurs where an inline
definition is visible, the behavior is the same as if the call were made to another function, say
__func, with internal linkage. A conforming program must not depend on which function is
called. This is the inline model in the Standard.
A conforming program must not rely on the implementation using the inline definition, nor may
it rely on the implementation using the external definition. The address of a function is always
the address corresponding to the external definition, but when this address is used to call the
function, the inline definition might be used. Therefore, the following example might not
behave as expected.
       inline const char *saddr(void)
       {     static const char name[] = "saddr";
             return name;
       }
       int compare_name(void)
       {     return saddr() == saddr(); // unspecified behavior
       }
Since the implementation might use the inline definition for one of the calls to saddr and use
the external definition for the other, the equality operation is not guaranteed to evaluate to 1
(true). This shows that static objects defined within the inline definition are distinct from their
corresponding object in the external definition. This motivated the constraint against even
defining a non-const object of this type.
Inlining was added to the Standard in such a way that it can be implemented with existing linker
technology, and a subset of C99 inlining is compatible with C++. This was achieved by
requiring that exactly one translation unit containing the definition of an inline function be
specified as the one that provides the external definition for the function. Because that
specification consists simply of a declaration that either lacks the inline keyword, or contains
both inline and extern, it will also be accepted by a C++ translator.
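
A minimal sketch of the arrangement described in the paragraph above,
collapsed into a single file for brevity.  The names twice and four_times
are made up for this illustration; in practice the inline definition would
sit in a header and the extern declaration in exactly one .c file.  The
sketch assumes a compiler in C99 (or later) mode.

```c
/* Would live in a header: an inline definition of the function. */
inline int twice(int x)
{
    return 2 * x;
}

/* Would live in exactly one .c file that includes the header: this
   declaration makes that translation unit provide the external
   definition of twice(). */
extern inline int twice(int x);

/* A caller: the implementation may use the inline definition or the
   external one, and the result must not depend on which is chosen. */
int four_times(int x)
{
    return twice(twice(x));
}
```
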
Inlining in C99 does extend the C++ specification in two ways. First, if a function is declared
inline in one translation unit, it need not be declared inline in every other translation unit.
This allows, for example, a library function that is to be inlined within the library but available
only through an external definition elsewhere. The alternative of using a wrapper function for
the external function requires an additional name; and it may also adversely impact performance
if a translator does not actually do inline substitution.
Second, the requirement that all definitions of an inline function be "exactly the same" is
replaced by the requirement that the behavior of the program should not depend on whether a
call is implemented with a visible inline definition, or the external definition, of a function.
This allows an inline definition to be specialized for its use within a particular translation unit.
For example, the external definition of a library function might include some argument
validation that is not needed for calls made from other functions in the same library. These
extensions do offer some advantages; and programmers who are concerned about compatibility
can simply abide by the stricter C++ rules.
Note that it is not appropriate for implementations to provide inline definitions of standard
library functions in the standard headers because this can break some legacy code that
redeclares standard library functions after including their headers. The inline keyword is
intended only to provide users with a portable way to suggest inlining of functions. Because the
standard headers need not be portable, implementations have other options along the lines of:
       #define abs(x) __builtin_abs(x)
or other non-portable mechanisms for inlining standard library functions.
-------------- next part --------------
From GCC 2.95.3 Manual
======================

 4.31 An Inline Function is As Fast As a Macro

By declaring a function inline, you can direct GNU CC to integrate that function's code into the code for its callers. This makes execution faster by eliminating the function-call overhead; in addition, if any of the actual argument values are constant, their known values may permit simplifications at compile time so that not all of the inline function's code needs to be included. The effect on code size is less predictable; object code may be larger or smaller with function inlining, depending on the particular case. Inlining of functions is an optimization and it really "works" only in optimizing compilation. If you don't use `-O', no function is really inline.

To declare a function inline, use the inline keyword in its declaration, like this:

inline int
inc (int *a)
{
  return (*a)++;
}

(If you are writing a header file to be included in ANSI C programs, write __inline__ instead of inline. See section 4.35 Alternate Keywords.) You can also make all "simple enough" functions inline with the option `-finline-functions'.

Note that certain usages in a function definition can make it unsuitable for inline substitution. Among these usages are: use of varargs, use of alloca, use of variable sized data types (see section 4.14 Arrays of Variable Length), use of computed goto (see section 4.3 Labels as Values), use of nonlocal goto, and nested functions (see section 4.4 Nested Functions). Using `-Winline' will warn when a function marked inline could not be substituted, and will give the reason for the failure.

Note that in C and Objective C, unlike C++, the inline keyword does not affect the linkage of the function.

GNU CC automatically inlines member functions defined within the class body of C++ programs even if they are not explicitly declared inline. (You can override this with `-fno-default-inline'; see section Options Controlling C++ Dialect.)

When a function is both inline and static, if all calls to the function are integrated into the caller, and the function's address is never used, then the function's own assembler code is never referenced. In this case, GNU CC does not actually output assembler code for the function, unless you specify the option `-fkeep-inline-functions'. Some calls cannot be integrated for various reasons (in particular, calls that precede the function's definition cannot be integrated, and neither can recursive calls within the definition). If there is a nonintegrated call, then the function is compiled to assembler code as usual. The function must also be compiled as usual if the program refers to its address, because that can't be inlined.

When an inline function is not static, then the compiler must assume that there may be calls from other source files; since a global symbol can be defined only once in any program, the function must not be defined in the other source files, so the calls therein cannot be integrated. Therefore, a non-static inline function is always compiled on its own in the usual fashion.

If you specify both inline and extern in the function definition, then the definition is used only for inlining. In no case is the function compiled on its own, not even if you refer to its address explicitly. Such an address becomes an external reference, as if you had only declared the function, and had not defined it.

This combination of inline and extern has almost the effect of a macro. The way to use it is to put a function definition in a header file with these keywords, and put another copy of the definition (lacking inline and extern) in a library file. The definition in the header file will cause most calls to the function to be inlined. If any uses of the function remain, they will refer to the single copy in the library.

GNU C does not inline any functions when not optimizing. It is not clear whether it is better to inline or not, in this case, but we found that a correct implementation when not optimizing was difficult. So we did the easy thing, and turned it off. 

 4.35 Alternate Keywords

The option `-traditional' disables certain keywords; `-ansi' disables certain others. This causes trouble when you want to use GNU C extensions, or ANSI C features, in a general-purpose header file that should be usable by all programs, including ANSI C programs and traditional ones. The keywords asm, typeof and inline cannot be used since they won't work in a program compiled with `-ansi', while the keywords const, volatile, signed, typeof and inline won't work in a program compiled with `-traditional'.

The way to solve these problems is to put `__' at the beginning and end of each problematical keyword. For example, use __asm__ instead of asm, __const__ instead of const, and __inline__ instead of inline.

Other C compilers won't accept these alternative keywords; if you want to compile with another compiler, you can define the alternate keywords as macros to replace them with the customary keywords. It looks like this:

#ifndef __GNUC__
#define __asm__ asm
#endif

`-pedantic' causes warnings for many GNU C extensions. You can prevent such warnings within one expression by writing __extension__ before the expression. __extension__ has no effect aside from this.

