[Rd] [External] undefined symbol errors when compiling package using ALTREP API

Mark Klik <markklik using gmail.com>
Wed Jun 5 00:29:25 CEST 2019


Hi Gabriel,

Thanks for your detailed explanation. It clarifies the design choices that
were made in setting up the ALTREP framework, and I can see how those
choices make sure existing code won't break.

My specific use case for wanting to check whether a vector is an ALTREP is
the following: the fst package wraps an external C++ library (fstlib,
independent of R) that was made for high-speed serialization of data
frames. Sequences are fairly common in data frames, and I'm planning to add
the concept of a sequence to the (R-agnostic) fst format. If I can detect,
e.g., a 'compact_intseq' ALTREP vector and just retrieve its 3-integer
internal representation, serialization could be very fast. Alternatively,
as you describe, the vector needs to be expanded first before
serialization, which will actually be slower than using an already expanded
vector and can take a lot of RAM for large datasets.

So being able to make use of the internal representation of (a few of the)
base ALTREP vectors can be very interesting for (non-R) serialization
schemes.
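
To illustrate the sequence idea described above, here is a minimal sketch in
plain C (independent of R and of fstlib) of storing an integer sequence as
three numbers instead of as expanded values. The `seq_triplet` struct and the
`seq_detect` / `seq_expand` helpers are hypothetical names made up for this
example; they are not part of fst, fstlib, or R's C API, and the sketch does
not show how to read the internal state of an ALTREP vector.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical compact form of an arithmetic integer sequence. */
typedef struct {
    int first;   /* first value of the sequence */
    int step;    /* constant difference between neighbouring values */
    size_t n;    /* number of elements */
} seq_triplet;

/* Scan an already expanded buffer; return true and fill *out when
   x[0..n-1] has a constant step. This costs a full O(n) pass. */
bool seq_detect(const int *x, size_t n, seq_triplet *out)
{
    assert(x != NULL && out != NULL);
    if (n < 2) return false;

    /* Use a wider type so the difference cannot overflow. */
    long long step = (long long)x[1] - (long long)x[0];
    if (step < INT_MIN || step > INT_MAX) return false;

    for (size_t i = 2; i < n; i++) {
        if ((long long)x[i] - (long long)x[i - 1] != step) return false;
    }
    out->first = x[0];
    out->step = (int)step;
    out->n = n;
    return true;
}

/* Rebuild the expanded values from the compact form.
   out must have room for s->n elements. */
void seq_expand(const seq_triplet *s, int *out)
{
    assert(s != NULL && out != NULL);
    for (size_t i = 0; i < s->n; i++) {
        out[i] = (int)(s->first + (long long)i * s->step);
    }
}
```

A serializer could write only the three fields when `seq_detect` succeeds and
fall back to writing the full buffer otherwise. Detection on an expanded
vector still requires the full scan (and the expansion itself), which is the
cost that reading a compact representation directly would avoid.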

thanks for your time!
Mark


On Tue, Jun 4, 2019 at 11:50 PM Gabriel Becker <gabembecker using gmail.com>
wrote:

> Hi Mark,
>
> So depending pretty strongly on what you mean by "ALTREP aware", packages
> aren't necessarily supposed to be ALTREP aware. What I mean by this is that
> as of right now, ALTREP objects are designed to be interacted with by
> non-ALTREP-implementing package code *more or less* exactly as standard
> (non-ALTREP) SEXPs are: via the published C API. The "more or less" comes
> from the fact that in some cases, doing things that are good ideas on
> standard SEXPs will work, but may not be a good idea for ALTREPs.
>
> The most "low-hanging-fruit" example of something that was best practice
> for standard vectors but is not a good idea for ALTREP vectors is grabbing
> a DATAPTR and iterating over the values without modification in a tight
> loop. This will work (absent allocation failure or, I suppose, the ALTREP
> being specifically designed to refuse to give you a full DATAPTR), but with
> ALTREP in place it's no longer what you want to do.
>
> That said, you don't want to check whether something is an ALTREP yourself
> and branch your code. What you want to do is use the ITERATE_BY_REGION
> macro in R_ext/Itermacros.h for ALL SEXPs, which will be nearly as fast for
> standard vectors and work safely for ALTREP vectors.
>
> Basically any time you find yourself wanting to check if something is an
> ALTREP and if so, call a specific ALT*_BLAH method, the intention is that
> there should be a universal API point you can call which will work for both
> types.
>
> This is true, e.g., of INTEGER_IS_SORTED (which will always work and just
> returns UNKNOWN_SORTEDNESS, i.e. INT_MIN, i.e. NA_INTEGER, for non-ALTREPs),
> of REAL_GET_REGION (which populates a double* with the requested values
> for both standard and ALTREP REALSXPs), etc.
>
> Does the above make sense?
>
> If you feel a universal API point is missing, you can raise that here,
> though I can't promise that will ultimately result in the method being
> added.
>
> Best,
> ~G
>
> On Tue, Jun 4, 2019 at 2:22 PM Mark Klik <markklik using gmail.com> wrote:
>
>> Thanks for clearing that up. So these methods are actually not meant to be
>> exported on Windows and OSX?
>> Some of the ALTREP methods that now use 'attribute_hidden' would be very
>> useful to packages that aim to be ALTREP aware. Should the currently
>> exported API be considered final?
>>
>> thanks for your time & best,
>> Mark
>>
>> On Tue, Jun 4, 2019 at 6:52 PM Tierney, Luke <luke-tierney using uiowa.edu>
>> wrote:
>>
>> > On Tue, 4 Jun 2019, Mark Klik wrote:
>> >
>> > > Hello,
>> > >
>> > > I'm developing a package (lazyvec) that makes full use of the ALTREP
>> > > framework (R >= 3.6.0).
>> > > One application of the package is to wrap existing ALTREP vectors in a
>> > > new ALTREP vector and pass all calls from R to the contained object. The
>> > > purpose of this is to provide a diagnostic framework for working with
>> > > ALTREP vectors and show information about internal calls.
>> > >
>> > > The package builds on Windows and OSX but fails to build on Linux, as
>> > > can be seen from the link to the Travis build:
>> > > https://travis-ci.org/fstpackage/lazyvec/jobs/539442806
>> > >
>> > > The reason for the build failure is that many ALTREP methods generate
>> > > 'undefined symbol' errors upon building the package (on Linux). I've
>> > > checked the R source code and the undefined symbols seem to be related
>> > > to the 'attribute_hidden' before the function definition. For example,
>> > > the method 'ALTVEC_EXTRACT_SUBSET' is defined as:
>> > >
>> > > SEXP attribute_hidden ALTVEC_EXTRACT_SUBSET(SEXP x, SEXP indx, SEXP call)
>> > >
>> > > My question is why these differences between Windows / OSX and Linux
>> > > exist and if they are intentional?
>> >
>> > It is intentional that this not be part of the public API. This is
>> > true of almost all functions with an ALTREP prefix. You need a
>> > different approach that avoids using these directly.
>> >
>> > Best,
>> >
>> > luke
>> >
>> > > Do I need special build parameters to make sure my package builds
>> > > correctly on Linux?
>> > >
>> > > thanks for all the hard work!
>> > >
>> > > best,
>> > > Mark
>> > >
>> > > PS: some additional info:
>> > >
>> > > package github repository: https://github.com/fstpackage/lazyvec
>> > > AppVeyor package build logs:
>> > > https://ci.appveyor.com/project/fstpackage/lazyvec
>> > > Travis package build logs:
>> > > https://travis-ci.org/fstpackage/lazyvec/builds
>> > >
>> > > ______________________________________________
>> > > R-devel using r-project.org mailing list
>> > > https://stat.ethz.ch/mailman/listinfo/r-devel
>> > >
>> >
>> > --
>> > Luke Tierney
>> > Ralph E. Wareham Professor of Mathematical Sciences
>> > University of Iowa                  Phone:             319-335-3386
>> > Department of Statistics and        Fax:               319-335-3017
>> >     Actuarial Science
>> > 241 Schaeffer Hall                  email:   luke-tierney using uiowa.edu
>> > Iowa City, IA 52242                 WWW:  http://www.stat.uiowa.edu
>> >
>>
>>
>
