[Rd] Including multiple third party libraries in an extension

Uwe Ligges ligges at statistik.tu-dortmund.de
Mon Nov 14 09:40:37 CET 2011



On 14.11.2011 03:55, Tyler Pirtle wrote:
> On Sun, Nov 13, 2011 at 6:25 PM, Simon Urbanek
> <simon.urbanek at r-project.org>wrote:
>
>>
>> On Nov 13, 2011, at 6:48 PM, Tyler Pirtle wrote:
>>
>>>
>>>
>>> On Sun, Nov 13, 2011 at 7:27 AM, Uwe Ligges<
>> ligges at statistik.tu-dortmund.de>  wrote:
>>>
>>>
>>> On 13.11.2011 05:22, Tyler Pirtle wrote:
>>> On Sat, Nov 12, 2011 at 8:08 PM, Tyler Pirtle<rtp at google.com>   wrote:
>>>
>>> Thanks Simon, a few replies...
>>>
>>> On Sat, Nov 12, 2011 at 6:14 AM, Simon Urbanek<
>>> simon.urbanek at r-project.org>   wrote:
>>>
>>> Tyler,
>>>
>>> On Nov 11, 2011, at 7:55 PM, Tyler Pirtle wrote:
>>>
>>> Hi,
>>>
>>> I've got a C extension structured roughly like:
>>>
>>> package/
>>>   src/
>>>     Makevars
>>>     foo.c
>>>     some-lib/...
>>>     some-other-lib/..
>>>
>>> where foo.c and Makevars define dependencies on some-lib and
>>> some-other-lib. Currently I have Makevars run configure && make install
>>> for some-lib and some-other-lib into a local build directory, which
>>> produces shared libraries that I ultimately reference, along with
>>> foo.o, in PKG_LIBS.
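>>>
>>> Concretely, the sort of Makevars I mean looks roughly like this (a
>>> sketch only - the library names, the build prefix, and the targets are
>>> placeholders, and recipe lines must begin with a tab):
>>>
>>> # src/Makevars (sketch): build the bundled libraries into a local
>>> # prefix, then point the compiler and linker at the results.
>>> LOCAL = $(CURDIR)/build
>>> PKG_CPPFLAGS = -I$(LOCAL)/include
>>> PKG_LIBS = -L$(LOCAL)/lib -lsome -lsomeother
>>>
>>> .PHONY: all local-libs
>>> all: $(SHLIB)
>>> $(SHLIB): local-libs
>>> local-libs:
>>> 	(cd some-lib && ./configure --prefix="$(LOCAL)" && $(MAKE) install)
>>> 	(cd some-other-lib && ./configure --prefix="$(LOCAL)" && $(MAKE) install)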
>>>
>>> I'm concerned about distribution. I've set up the appropriate magic
>>> with rpath for the package's .so
>>>
>>> That is certainly non-portable and won't work for the vast majority of
>>> users.
>>>
>>>
>>> Yeah, I figured, but apparently I have other, more pressing problems... ;)
>>>
>>>
>>>
>>> (meaning
>>> that when the final .so is produced, its dynamic library dependencies
>>> on some-lib and some-other-lib will prefer the locations built in
>>> src/some-lib/... and src/some-other-lib/...). But does this preclude me
>>> from being able to distribute a binary package?
>>>
>>> Yes. And I doubt the package will work the way you described it at all,
>>> because the "deep" .so won't even be installed. Also there are potential
>>> issues in multi-arch R (please consider testing that as well).
>>>
>>>
>>> Understood. I wasn't a fan of any of the potential solutions I'd seen
>>> (one of which included source-only availability). I've seen some other
>>> folks using the inst/ or data/ dirs for purposes like this, but I agree
>>> it's ugly and has issues. You raise a great point, too, about
>>> multi-arch R. I have potential users who are definitely on
>>> heterogeneous architectures. I noticed that when I run R CMD INSTALL
>>> --build . to check my current build, I end up with a src-${ARCH} for
>>> both x86_64 and i386 - is there more explicit multi-arch testing I
>>> should be doing?
>>>
>>>
>>>
>>> If I do want to build a binary
>>> distribution, is there a way I can
>>> package up everything needed, not just the resulting .so?
>>>
>>> Or, are there better ways to bundle extension-specific third party
>>> dependencies? ;) I'd rather not have
>>> my users have to install obscure libraries globally on their systems.
>>>
>>>
>>> Typically the best solution is to compile the dependencies as
>>> --disable-shared --enable-static --with-pic (in autoconf speak - you
>>> don't need to actually use autoconf). That way your .so has all its
>>> dependencies inside and you avoid all run-time hassle. Note that it is
>>> very unlikely that you can take advantage of the dynamic nature of the
>>> dependencies (since no one else knows about them anyway), so there is
>>> no real point in building them dynamically.
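>>>
>>> A sketch of what that looks like for one bundled dependency (the names
>>> and the local prefix are placeholders; the three flags are the point):
>>>
>>> (cd some-lib && \
>>>    ./configure --disable-shared --enable-static --with-pic \
>>>                --prefix="$(LOCAL)" && \
>>>    $(MAKE) install)
>>>
>>> # then link the static archives directly instead of -lsome -lsomeother:
>>> PKG_LIBS = $(LOCAL)/lib/libsome.a $(LOCAL)/lib/libsomeother.a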
>>>
>>>
>>> That is a much better solution and the one I've been looking for! I was
>>> afraid I'd have to manually specify all the dependency objects, but if
>>> I just disable shared then that makes much more sense; I can let the
>>> compiler and linker do the work for me.
>>>
>>>
>>> Also note that typically you want to use the package-level configure to
>>> run subconfigures, and *not* Makevars. (There may be reasons for an
>>> exception to that convention, but you need to be aware of the
>>> differences in multi-arch builds: Makevars builds all architectures at
>>> once from separate copies of the src directory, whereas the presence of
>>> configure allows you to treat your package one architecture at a time,
>>> and you can pass through parameters.)
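>>>
>>> The shape of such a top-level configure, as a sketch (the sub-library
>>> names are placeholders; picking up R's compiler settings as shown is
>>> the usual idiom):
>>>
>>> #!/bin/sh
>>> # configure (package top level): run once per architecture; configure
>>> # the bundled libraries with R's compiler settings passed through.
>>> : ${R_HOME=`R RHOME`}
>>> CC=`"${R_HOME}/bin/R" CMD config CC`
>>> CFLAGS=`"${R_HOME}/bin/R" CMD config CFLAGS`
>>> export CC CFLAGS
>>> for lib in some-lib some-other-lib; do
>>>   (cd "src/$lib" && \
>>>    ./configure --disable-shared --enable-static --with-pic) || exit 1
>>> done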
>>>
>>>
>>> Understood. Is src/ still the appropriate directory then for my
>>> third-party packages? Also, do you happen to know of any packages
>>> off-hand that I can use as a reference?
>>>
>>> Thanks Simon! Your insights here are invaluable. I really appreciate it.
>>>
>>>
>>>
>>> Tyler
>>>
>>>
>>>
>>> Ah, also a few more questions...
>>>
>>> I don't really understand the flow for developing multi-arch extensions.
>>> Does configure run only once?
>>>
>>> Depends on the platform. For example, if you are on Windows and have a
>>> configure.win, you can tell R to run it for each architecture: see the
>>> R Installation and Administration manual and also
>>>
>>> R CMD INSTALL --help, which has, e.g., under Windows:
>>>
>>>   --force-biarch    attempt to build both architectures
>>>                     even if there is a non-empty configure.win
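>>>
>>> (i.e. something along the lines of: R CMD INSTALL --force-biarch
>>> mypkg_0.1.tar.gz, with mypkg standing in for the real package)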
>>>
>>>
>>>
>>> Once per arch? What is the state of src-${ARCH} by the time the
>>> src/Makevars or Makefile is executed? Is any of this actually in the
>>> manual and am I just missing it? ;)
>>>
>>> The Makevars file (or Makefile) is executed once for each architecture.
>>>
>>>
>>>
>>> And why does R_ARCH start with a '/'? ;)
>>>
>>> It is typically used as part of a path's name.
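>>>
>>> For instance, paths are assembled as libs$(R_ARCH), which expands to
>>> libs/x64 or libs/i386 in a multi-arch build - and to just libs on a
>>> single-arch build, where R_ARCH is empty.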
>>>
>>> Uwe Ligges
>>>
>>>
>>> Thanks Uwe, very helpful stuff. I have the problem that I can't
>>> configure all my third-party packages at once since they're
>>> interdependent, so I have to deal with R_ARCH in my Makefile.
>>>
>>
>> You should not need to, since it's irrelevant for you as a package
>> author; it is used internally by R. (Also note that Makevars is
>> preferred to Makefile, since it is much more fragile to re-create the R
>> build process in the latter, and thus a Makefile is only used in very
>> special circumstances.)
>>
>>
> That explains a few details then; I thought I was ultimately responsible
> for producing binaries, but as you pointed out below that's not the
> case... And I misspoke - I'm using a Makevars; I saw the warning
> elsewhere as well.
>
>
>>> I'm afraid I don't understand at all how portability is managed with
>>> respect to packages. I mean, I'm not sure how multi-arch and CRAN all
>>> sort of fit together to make my package ultimately available via binary
>>> distribution to users on all sorts of platforms. How does all this
>>> work?
>>>
>>
>> As long as your code is portable and you use R's facilities (instead of
>> creating your own), it's all automatic. Packages are built on each platform
>> separately and then distributed on CRAN. To answer your previous question:
>> for multi-arch platforms (on CRAN that is Windows and Mac OS X) the package
>> is built separately for each architecture if your package contains
>> configure or Makefile. Otherwise it is built in one go (see R-admin 6.3.4).
>>
>>
> I guess that's the interesting question: is my code portable? That's
> something else that I don't fully understand - why are all architectures
> built if configure or Makefile are missing?

Because then it is easy to do so and no two runs are required. With a 
configure script, the automagical processes cannot guess enough and need 
to do things separately.

> I guess I don't really understand the purpose of multiple
> sub-architectures (maybe, for example, if I were on Windows and building
> both natively and with Cygwin?

No. Natively 32-bit and natively 64-bit architectures. We do not provide 
any Cygwin binaries (which is another platform anyway).

The difference is just in the compiled code. Hence the package is almost 
the same; it just ships with two libs subdirectories, ./x64 and ./i386, 
each containing the right dll(s) for the corresponding platform.
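
Schematically, the installed package then looks like (a sketch):

   mypackage/
     libs/
       i386/mypackage.dll
       x64/mypackage.dll
     R/, help/, ...   (identical for both architectures)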


> Is that the purpose?). I'm not sure I get you when you say it's "built
> in one go" - what is? My package? It seems to be building just my
> (guessed) arch as well.

"In one go" means:
- For packages without compiled code, it is really only one go, since 
everything should already be portable.
- For packages with compiled code, the libraries (e.g. dlls on Windows) 
are compiled separately but within the same installation procedure. All 
the rest is done only once.

Best,
Uwe Ligges


>
>
>
>
>>> Say I can test and am willing to support certain architectures and
>>> certain OS distributions - say Mac OS X, Linux, Windows, etc. - and I
>>> can verify that my package builds in those environments (under some
>>> minimal set of conditions). What is CRAN's purpose then?
>>>
>>> Am I meant to submit a binary build for each arch/OS as separate
>>> packages?
>>>
>>
>> You're not supposed to supply any binaries. CRAN builds them from the
>> sources you provide.
>>
>>
> Thanks for the clarification.
>
>
>
>
>> Cheers,
>> Simon
>>
>>
>>> My apologies for these questions; I'm quite new to this community, and
>>> all of your help has been amazing. I really do appreciate it. Please
>>> point me at any relevant documentation as well; I'm happy to go read.
>>>
>>> Hopefully I can contribute something back in a timely fashion here that
>>> will be helpful to a wider audience ;)
>>>
>>>
>>> Thanks,
>>>
>>>
>>> Tyler
>>>
>>>
>>>
>>>
>>> thanks again!
>>>
>>>
>>> Tyler
>>>
>>>
>>>
>>>
>>>
>>>
>>> Cheers,
>>> Simon
>>>
>>>
>>>
>>>
>>>
>>>
>>> ______________________________________________
>>> R-devel at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>>
>>
>>
>


