[Rd] multiple definitions in C code

Peter Dalgaard BSA p.dalgaard@biostat.ku.dk
03 Jan 2002 00:44:22 +0100


Jan de Leeuw <deleeuw@stat.ucla.edu> writes:

> There is a problem in MacOS X with multiple definitions of the same
> symbol in different files that will be put into the same bundle or
> dynamic library by the dynamic linker. It occurs, for example, in
> the rpart package, which includes rpart.h in all its source files,
> and rpart.h has definitions of a structure rp and functions rp_init
> and so on. I think the same problem occurs in the grid package --
> all 150+ others seem to be fine. Just out of curiosity, I looked into
> the problem a bit, in particular as it relates to dyld in OS X.
> 
> ==============================================================================
> According to the ANSI Standard
> 
> G.5 Common Extensions
> 
> The following extensions are widely used in many systems, but are not
> portable to all implementations. The inclusion of any extension that
> may cause a strictly conforming program to become invalid renders an
> implementation non-conforming.
> 
> ...
> 
> G.5.11 Multiple external definitions
> 
> There may be more than one external definition for the identifier of
> an object, with or without the explicit use of the keyword extern. If
> the definitions disagree, or more than one is initialized, the
> behavior is undefined.
> ==============================================================================
> See also Summit, C Programming FAQ, section 1.7, or Koenig, C Traps and
> Pitfalls, pages 54-56. Portability considerations make them suggest
> that the safe way to proceed is to have only a single definition, in a
> source file, with or without initialization, and everywhere else (in
> header files) external declarations.


Yes. The standard way is to do something like this in the include file:

#ifndef extern
#define extern extern
#endif

extern struct {
...
} rp;

and then in *one* of the files precede the inclusion with 

#define extern
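
A minimal sketch of the same idea, using a dedicated macro (hypothetically
named RP_EXTERN) instead of redefining the keyword itself, so that system
headers included later are not affected. The struct tag, its fields, and the
rp_init signature are made up for illustration and are not taken from rpart.
Header and source file are shown together in one listing:

```c
/* ---- contents of a hypothetical header, e.g. rp_demo.h ---- */
#ifndef RP_EXTERN
#define RP_EXTERN extern        /* default: includers get a declaration only */
#endif

struct rp_state {               /* hypothetical fields, for illustration */
    int    n;
    double alpha;
};

RP_EXTERN struct rp_state rp;   /* a declaration while RP_EXTERN is "extern" */

void rp_init(int n, double alpha);  /* functions: prototype only in the header */
/* ---- end of header ---- */

/* ---- exactly one source file holds the single true definition ----
 * With separate files, that file would put "#define RP_EXTERN" (empty)
 * before including the header; here the definition is written out. */
struct rp_state rp;

void rp_init(int n, double alpha)
{
    rp.n = n;
    rp.alpha = alpha;
}
```

Every other source file just includes the header and sees only the extern
declaration, so the linker finds one definition of rp and one of rp_init.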

However, as I read G.5.11, the way it is being used in rpart is in
fact in accordance with ANSI C. It says that behaviour is well-defined
except in case of conflicts.

> ==============================================================================
>  From the MacOS X ld man page:
> 
> Different linking can occur only when there is more than one
> definition of a symbol and the library modules that contain the
> definitions for that symbol do not define and reference exactly the
> same symbols. In this case, even different executions of the same
> program can produce different linking because the dynamic linker
> binds undefined functions as they are called, and this affects the
> order in which undefined symbols are bound. Because it can produce
> different dynamic linking, using dynamic shared libraries that define
> the same symbols in the same program is strongly discouraged.
> ==============================================================================
>  From Inside Mac OS X (page 134)
> 
> When you create a framework, you must ensure that each symbol
> is defined only once in a library. In addition, "common" symbols
> are not allowed in the library, you must use a single true definition
> and precede all other definitions with the extern key word in C
> code.
> ==============================================================================
> 
> Thus we see that the dynamic linker sets things up for "lazy
> linking", where symbols are loaded at run time if they are actually
> needed (and not at startup). It is different, I think, from Solaris
> (see Solaris Porting Guide, p. 102-103), which also has "lazy
> linking", but allows multiple definitions ("the first one wins").

The really nasty case is when two different packages happen to use the
same symbol. I seem to recall that this is why there is no OpenOffice
for MacOS X, since they had an explicit requirement that each of their
plugin modules contain a specifically named initialization routine!

-- 
   O__  ---- Peter Dalgaard             Blegdamsvej 3  
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N   
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard@biostat.ku.dk)             FAX: (+45) 35327907