[Rd] Including multiple third party libraries in an extension

Simon Urbanek simon.urbanek at r-project.org
Mon Nov 14 16:01:52 CET 2011


On Nov 13, 2011, at 9:55 PM, Tyler Pirtle wrote:

> 
> 
> On Sun, Nov 13, 2011 at 6:25 PM, Simon Urbanek <simon.urbanek at r-project.org> wrote:
> 
> On Nov 13, 2011, at 6:48 PM, Tyler Pirtle wrote:
> 
> >
> >
> > On Sun, Nov 13, 2011 at 7:27 AM, Uwe Ligges <ligges at statistik.tu-dortmund.de> wrote:
> >
> >
> > On 13.11.2011 05:22, Tyler Pirtle wrote:
> > On Sat, Nov 12, 2011 at 8:08 PM, Tyler Pirtle<rtp at google.com>  wrote:
> >
> > Thanks Simon, a few replies...
> >
> > On Sat, Nov 12, 2011 at 6:14 AM, Simon Urbanek<
> > simon.urbanek at r-project.org>  wrote:
> >
> > Tyler,
> >
> > On Nov 11, 2011, at 7:55 PM, Tyler Pirtle wrote:
> >
> > Hi,
> >
> > I've got a C extension structured roughly like:
> >
> > package/
> >  src/
> >    Makevars
> >    foo.c
> >    some-lib/...
> >    some-other-lib/..
> >
> > where foo.c and Makevars define dependencies on some-lib and
> > some-other-lib. Currently I have Makevars configure and make install
> > some-lib and some-other-lib into a local build directory, which produces
> > shared libraries that I ultimately reference for foo.o in PKG_LIBS.
> >
> > I'm concerned about distribution. I've set up the appropriate magic with
> > rpath for the package's .so
> >
> > That is certainly non-portable and won't work for the vast majority of
> > users.
> >
> >
> > Yea I figured, but apparently I have other, more pressing problems.. ;)
> >
> >
> >
> > (meaning
> > that when the final .so is produced, its dynamic-library dependencies on
> > some-lib and some-other-lib
> > will prefer the locations built in src/some-lib/... and
> > src/some-other-lib/...). But does this preclude me from
> > being able to distribute a binary package?
> >
> > Yes. And I doubt the package will work the way you described it at all,
> > because the "deep" .so won't even be installed. Also there are potential
> > issues in multi-arch R (please consider testing that as well).
> >
> >
> > Understood. I wasn't a fan of any of the potential solutions I'd seen (one
> > of which included source-only availability).
> > I've seen some other folks using the inst/ or data/ dirs for purposes like
> > this, but I agree it's ugly and has
> > issues. You raise a great point, too, about multi-arch R. I have potential
> > users who are definitely on
> > heterogeneous architectures. I noticed that when I R CMD INSTALL --build .
> > to check my current build,
> > I end up with a src-${ARCH} for both x86_64 and i386 - is there more
> > explicit multi-arch testing I should be
> > doing?
> >
> >
> >
> > If I do want to build a binary
> > distribution, is there a way I can
> > package up everything needed, not just the resulting .so?
> >
> > Or, are there better ways to bundle extension-specific third party
> > dependencies? ;) I'd rather not have
> > my users have to install obscure libraries globally on their systems.
> >
> >
> > Typically the best solution is to compile the dependencies as
> > --disable-shared --enable-static --with-pic (in autoconf speak - you don't
> > need to actually use autoconf). That way your .so has all its dependencies
> > inside and you avoid all run-time hassle. Note that it is very unlikely
> > that you can take advantage of the dynamic nature of the dependencies
> > (since no one else knows about them anyway), so there is no real point in
> > building them dynamically.
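A minimal Makevars sketch of the static-linking approach Simon describes above. The directory and library names here are illustrative assumptions, not from the thread:

```make
# Hypothetical bundled dependency "some-lib", built as a static,
# position-independent archive and linked into the package's .so.
SOMELIB_DIR = some-lib
SOMELIB_A   = $(SOMELIB_DIR)/libsome.a

PKG_CPPFLAGS = -I$(SOMELIB_DIR)/include
PKG_LIBS     = $(SOMELIB_A)

# Make sure the archive exists before the package shared object is linked.
$(SHLIB): $(SOMELIB_A)

$(SOMELIB_A):
	(cd $(SOMELIB_DIR) && \
	  ./configure --disable-shared --enable-static --with-pic && \
	  $(MAKE))
```

Because the archive is linked in statically, the installed package carries no run-time dependency on some-lib's shared library.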
> >
> >
> > That is a much better solution and the one I've been looking for! I was
> > afraid I'd have to manually specify all the dependency objects, but if I
> > just disable
> > shared then that makes much more sense; I can let the compiler and linker
> > do the work for me.
> >
> >
> > Also note that typically you want to use the package-level configure to
> > run subconfigures, and *not* Makevars. (There may be reasons for an
> > exception to that convention, but you need to be aware of the differences
> > in multi-arch builds: Makevars builds all architectures at once from
> > separate copies of the src directories, whereas the presence of configure
> > allows you to treat your package as one architecture at a time, and you
> > can pass through parameters.)
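As a rough sketch of that convention (the subdirectory name and flags are assumptions, not from the thread), a package-level configure script can run the bundled library's own configure, passing R's compiler settings through; since R re-runs configure for each architecture, each run configures the dependency for the architecture being built:

```sh
#!/bin/sh
# Hypothetical package-level configure script.
: ${R_HOME=`R RHOME`}
CC=`"${R_HOME}/bin/R" CMD config CC`
CFLAGS=`"${R_HOME}/bin/R" CMD config CFLAGS`
(cd src/some-lib && \
  CC="$CC" CFLAGS="$CFLAGS" \
  ./configure --disable-shared --enable-static --with-pic) || exit 1
```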
> >
> >
> > Understood. Is src/ still the appropriate directory then for my third
> > party packages? Also, do you happen to know of any packages off-hand that I
> > can use
> > as a reference?
> >
> > Thanks Simon! Your insights here are invaluable. I really appreciate it.
> >
> >
> >
> > Tyler
> >
> >
> >
> > Ah, also a few more questions...
> >
> > I don't really understand the flow for developing multi-arch extensions.
> > Does configure run only once?
> >
> > Depends on the platform. For example, if you are on Windows and have a configure.win, you can tell R to run it for each architecture: see the R Installation and Administration manual and also
> >
> > R CMD INSTALL --help which has, e.g., under Windows:
> >
> >  --force-biarch    attempt to build both architectures
> >                    even if there is a non-empty configure.win
> >
> >
> >
> > Once per arch? What is the state of
> > src-${ARCH} by the time the src/Makevars or Makefile is executed? Is any of
> > this actually in the manual and am I just missing it? ;)
> >
> > The Makevars/Makefile is executed for each architecture.
> >
> >
> >
> > And why does R_ARCH start with a '/'? ;)
> >
> > It is typically used as part of a path's name.
> >
> > Uwe Ligges
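To illustrate why the leading '/' is convenient (the architecture value below is hypothetical): R_ARCH is empty on single-architecture builds and slash-prefixed otherwise, so it can be appended to a directory name without inserting a separator:

```shell
# On a multi-arch build R_ARCH might be "/x86_64"; on a single-arch
# build it is simply empty, and the same concatenation still works.
R_ARCH="/x86_64"
echo "mypackage/libs${R_ARCH}"   # prints mypackage/libs/x86_64

R_ARCH=""
echo "mypackage/libs${R_ARCH}"   # prints mypackage/libs
```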
> >
> >
> > Thanks Uwe, very helpful stuff. I have the problem that I can't configure all my
> > third-party packages at once since they're inter-dependent, so I have to deal with
> > R_ARCH in my Makefile.
> >
> 
> You should not need to, since it's irrelevant for you as a package author; it is used internally by R. (Also note that Makevars is preferred to Makefile, since re-creating the R build process in the latter is much more fragile; a Makefile is only used in very special circumstances.)
> 
> 
> That explains a few details then. I thought I was ultimately responsible for producing binaries, but as you pointed out below
> that's not the case...
>  And I misspoke - I'm using a Makevars; I saw the warning elsewhere as well.
>  
> > I'm afraid I don't understand at all how portability is managed with respect to packages. I mean,
> > I'm not sure how multi-arch and CRAN all sort of fit together to make my package ultimately
> > available via binary distribution to users on all sorts of platforms. How does all this work?
> >
> 
> As long as your code is portable and you use R's facilities (instead of creating your own), it's all automatic. Packages are built on each platform separately and then distributed on CRAN. To answer your previous question: for multi-arch platforms (on CRAN that is Windows and Mac OS X) the package is built separately for each architecture if your package contains configure or Makefile. Otherwise it is built in one go (see R-admin 6.3.4).
> 
> 
> I guess that's the interesting question: is my code portable? That's something else that I don't fully understand. Why are all architectures built
> if configure or Makefile are missing? I guess I don't really understand the purpose of multiple sub-architectures (maybe, for example, if I were
> on Windows and building both natively and with Cygwin? Is that the purpose?).

No. Several OSes support multiple architectures; for example, Mac OS X 10.5 supports PowerPC and Intel, each with 32-bit and 64-bit variants. That gives a total of 4 architectures: i386, x86_64, ppc and ppc64. The R binary for such a platform therefore has to support multiple architectures, and this is done by keeping only one set of non-binary files plus several sets of binary files, one per architecture. On Windows there are 32-bit and 64-bit binaries (i386 and x64), so there are two sets of binary files, one for 32-bit and one for 64-bit. This allows common distribution without the mess of having a separate build for each architecture.


> I'm not sure I get you when you say it's "built in one go" - what is? My package? It seems to be building just my (guessed) arch as well.
> 

No. If you have neither configure nor a Makefile, R will build every architecture it supports, e.g.:

* installing *source* package 'fastmatch' ...
** package 'fastmatch' successfully unpacked and MD5 sums checked
** libs
*** arch - i386
gcc-4.2 -arch i386 -std=gnu99 -I/Library/Frameworks/R.framework/Resources/include -I/Library/Frameworks/R.framework/Resources/include/i386  -I/usr/local/include    -fPIC  -g -O2 -c fastmatch.c -o fastmatch.o
gcc-4.2 -arch i386 -std=gnu99 -dynamiclib -Wl,-headerpad_max_install_names -undefined dynamic_lookup -single_module -multiply_defined suppress -L/usr/local/lib -o fastmatch.so fastmatch.o -F/Library/Frameworks/R.framework/.. -framework R -Wl,-framework -Wl,CoreFoundation
installing to /private/tmp/rl/fastmatch/libs/i386
*** arch - ppc
gcc-4.2 -arch ppc -std=gnu99 -I/Library/Frameworks/R.framework/Resources/include -I/Library/Frameworks/R.framework/Resources/include/ppc  -I/usr/local/include    -fPIC  -g -O2 -c fastmatch.c -o fastmatch.o
gcc-4.2 -arch ppc -std=gnu99 -dynamiclib -Wl,-headerpad_max_install_names -undefined dynamic_lookup -single_module -multiply_defined suppress -L/usr/local/lib -o fastmatch.so fastmatch.o -F/Library/Frameworks/R.framework/.. -framework R -Wl,-framework -Wl,CoreFoundation
installing to /private/tmp/rl/fastmatch/libs/ppc
*** arch - x86_64
gcc-4.2 -arch x86_64 -std=gnu99 -I/Library/Frameworks/R.framework/Resources/include -I/Library/Frameworks/R.framework/Resources/include/x86_64  -I/usr/local/include    -fPIC  -g -O2 -c fastmatch.c -o fastmatch.o
gcc-4.2 -arch x86_64 -std=gnu99 -dynamiclib -Wl,-headerpad_max_install_names -undefined dynamic_lookup -single_module -multiply_defined suppress -L/usr/local/lib -o fastmatch.so fastmatch.o -F/Library/Frameworks/R.framework/.. -framework R -Wl,-framework -Wl,CoreFoundation
installing to /private/tmp/rl/fastmatch/libs/x86_64
** R
** preparing package for lazy loading
** help
*** installing help indices
** building package indices ...
** testing if installed package can be loaded

* DONE (fastmatch)


As you can see, it compiled and installed the binaries for all three architectures (Intel 32-bit, PowerPC 32-bit and Intel 64-bit -- we don't support ppc64 anymore). That is possible because R is taking care of all the building, so it can do the right thing without my even having to specify what to do.

Cheers,
Simon


>  
> > Say I can test and am willing to support certain architectures and certain OS distributions,
> > say Mac OS X, Linux, Windows, etc. and I can verify that my package builds in those
> > environments (under some minimal set of conditions). What is CRAN's purpose then?
> >
> > Am I meant to submit a binary build for each arch/OS as separate packages?
> >
> 
> You're not supposed to supply any binaries. CRAN builds them from the sources you provide.
> 
> 
> Thanks for the clarification.
> 
> 
>  
> Cheers,
> Simon
> 
> 
> > My apologies for these questions, I'm quite new to this community, and all of your
> > help has been amazing, I really do appreciate it. Please point me at any relevant
> > documentation as well, I'm happy to go read.
> >
> > Hopefully I can contribute something back in a timely fashion here that will be
> > helpful to a wider audience ;)
> >
> >
> > Thanks,
> >
> >
> > Tyler
> >
> >
> >
> >
> > thanks again!
> >
> >
> > Tyler
> >
> >
> >
> >
> >
> >
> > Cheers,
> > Simon
> >
> >
> >
> >
> >
> >
> > ______________________________________________
> > R-devel at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
> >
> 
> 


