[Rd] [External] Possible ALTREP bug

iuke-tier@ey m@iii@g oii uiow@@edu iuke-tier@ey m@iii@g oii uiow@@edu
Thu Jun 17 20:31:55 CEST 2021


On Thu, 17 Jun 2021, Toby Hocking wrote:

> Oliver, for clarification that section in writing R extensions mentions
> VECTOR_ELT and REAL but not REAL_ELT nor any other *_ELT functions. I was
> looking for an explanation of all the *_ELT functions (which are apparently
> new), not just VECTOR_ELT.
> Thanks Simon that response was very helpful.
> One more question: are there any circumstances in which one should use
> REAL_ELT(x,i) rather than REAL(x)[i] or vice versa? Or can they be used
> interchangeably?

For a single call it is better to use REAL_ELT(x, i) since it doesn't
force allocating a possibly large object in order to get a pointer to
its data with REAL(x).  If you are iterating over a whole object you
may want to get data in chunks. There are iteration macros that
help. Some examples are in src/main/summary.c.

Best,

luke

> 
> On Wed, Jun 16, 2021 at 4:29 PM Simon Urbanek <simon.urbanek using r-project.org>
> wrote:
>       The usual quote applies: "use the source, Luke":
>
>       $ grep _ELT *.h | sort
>       Rdefines.h:#define SET_ELEMENT(x, i, val)     
>        SET_VECTOR_ELT(x, i, val)
>       Rinternals.h:   The function STRING_ELT is used as an argument
>       to arrayAssign even
>       Rinternals.h:#define VECTOR_ELT(x,i)    ((SEXP *) DATAPTR(x))[i]
>       Rinternals.h://SEXP (STRING_ELT)(SEXP x, R_xlen_t i);
>       Rinternals.h:Rbyte (RAW_ELT)(SEXP x, R_xlen_t i);
>       Rinternals.h:Rbyte ALTRAW_ELT(SEXP x, R_xlen_t i);
>       Rinternals.h:Rcomplex (COMPLEX_ELT)(SEXP x, R_xlen_t i);
>       Rinternals.h:Rcomplex ALTCOMPLEX_ELT(SEXP x, R_xlen_t i);
>       Rinternals.h:SEXP (STRING_ELT)(SEXP x, R_xlen_t i);
>       Rinternals.h:SEXP (VECTOR_ELT)(SEXP x, R_xlen_t i);
>       Rinternals.h:SEXP ALTSTRING_ELT(SEXP, R_xlen_t);
>       Rinternals.h:SEXP SET_VECTOR_ELT(SEXP x, R_xlen_t i, SEXP v);
>       Rinternals.h:double (REAL_ELT)(SEXP x, R_xlen_t i);
>       Rinternals.h:double ALTREAL_ELT(SEXP x, R_xlen_t i);
>       Rinternals.h:int (INTEGER_ELT)(SEXP x, R_xlen_t i);
>       Rinternals.h:int (LOGICAL_ELT)(SEXP x, R_xlen_t i);
>       Rinternals.h:int ALTINTEGER_ELT(SEXP x, R_xlen_t i);
>       Rinternals.h:int ALTLOGICAL_ELT(SEXP x, R_xlen_t i);
>       Rinternals.h:void ALTCOMPLEX_SET_ELT(SEXP x, R_xlen_t i,
>       Rcomplex v);
>       Rinternals.h:void ALTINTEGER_SET_ELT(SEXP x, R_xlen_t i, int v);
>       Rinternals.h:void ALTLOGICAL_SET_ELT(SEXP x, R_xlen_t i, int v);
>       Rinternals.h:void ALTRAW_SET_ELT(SEXP x, R_xlen_t i, Rbyte v);
>       Rinternals.h:void ALTREAL_SET_ELT(SEXP x, R_xlen_t i, double v);
>       Rinternals.h:void ALTSTRING_SET_ELT(SEXP, R_xlen_t, SEXP);
>       Rinternals.h:void SET_INTEGER_ELT(SEXP x, R_xlen_t i, int v);
>       Rinternals.h:void SET_LOGICAL_ELT(SEXP x, R_xlen_t i, int v);
>       Rinternals.h:void SET_REAL_ELT(SEXP x, R_xlen_t i, double v);
>       Rinternals.h:void SET_STRING_ELT(SEXP x, R_xlen_t i, SEXP v);
>
>       So the indexing is with R_xlen_t and they return the value
>       itself as one would expect.
>
>       Cheers,
>       Simon
> 
>
>       > On Jun 17, 2021, at 2:22 AM, Toby Hocking <tdhock5 using gmail.com>
>       wrote:
>       >
>       > By the way, where is the documentation for INTEGER_ELT,
>       REAL_ELT, etc? I
>       > looked in Writing R Extensions and R Internals but I did not
>       see any
>       > mention.
>       > REAL_ELT is briefly mentioned on
>       > https://svn.r-project.org/R/branches/ALTREP/ALTREP.html
>       > Would it be possible to please add some mention of them to
>       Writing R
>       > Extensions?
>       > - how many of these _ELT functions are there? INTEGER, REAL,
>       ... ?
>       > - in what version of R were they introduced?
>       > - I guess input types are always SEXP and int?
>       > - What are the output types for each?
>       >
>       > On Fri, May 28, 2021 at 5:16 PM <luke-tierney using uiowa.edu>
>       wrote:
>       >
>       >> Since the INTEGER_ELT, REAL_ELT, etc, functions are fairly
>       new it may
>       >> be possible to check that places where they are used allow
>       for them to
>       >> allocate. I have fixed the one that got caught by Gabor's
>       example, and
>       >> a rchk run might be able to pick up others if rchk knows
>       these could
>       >> allocate. (I may also be forgetting other places where the
>       _ELt
>       >> methods are used.)  Fixing all call sites for REAL, INTEGER,
>       etc, was
>       >> never realistic so there GC has to be suspended during the
>       method
>       >> call, and that is done in the dispatch mechanism.
>       >>
>       >> The bigger problem is jumps from inside things that existing
>       code
>       >> assumes will not do that. Catching those jumps is possible
>       but
>       >> expensive; doing anything sensible if one is caught is really
>       not
>       >> possible.
>       >>
>       >> Best,
>       >>
>       >> luke
>       >>
>       >> On Fri, 28 May 2021, Gabriel Becker wrote:
>       >>
>       >>> Hi Jim et al,
>       >>> Just to hopefully add a bit to what Luke already answered,
>       from what I am
>       >>> recalling looking back at that bioconductor thread Elt
>       methods are used
>       >> in
>       >>> places where there are hard implicit assumptions that no
>       garbage
>       >> collection
>       >>> will occur (ie they are called on things that aren't
>       PROTECTed), and
>       >> beyond
>       >>> that, in places where there are hard assumptions that no
>       error (longjmp)
>       >>> will occur. I could be wrong, but I don't know that
>       suspending garbage
>       >>> collection would protect from the second one. Ie it is
>       possible that an
>       >>> error *ever* being raised from R code that implements an elt
>       method could
>       >>> cause all hell to break loose.
>       >>>
>       >>> Luke or Tomas Kalibera would know more.
>       >>>
>       >>> I was disappointed that implementing ALTREPs in R code was
>       not in the
>       >> cards
>       >>> (it was in my original proposal back in 2016 to the DSC) but
>       I trust Luke
>       >>> that there are important reasons we can't safely allow that.
>       >>>
>       >>> Best,
>       >>> ~G
>       >>>
>       >>> On Fri, May 28, 2021 at 8:31 AM Jim Hester
>       <james.f.hester using gmail.com>
>       >> wrote:
>       >>>      From reading the discussion on the Bioconductor issue
>       tracker it
>       >>>      seems like
>       >>>      the reason the GC is not suspended for the non-string
>       ALTREP Elt
>       >>>      methods is
>       >>>      primarily due to performance concerns.
>       >>>
>       >>>      If this is the case perhaps an additional flag could be
>       added to
>       >>>      the
>       >>>      `R_set_altrep_*()` functions so ALTREP authors could
>       indicate if
>       >>>      GC should
>       >>>      be halted when that particular method is called for
>       that
>       >>>      particular ALTREP
>       >>>      class.
>       >>>
>       >>>      This would avoid the performance hit (other than a
>       boolean
>       >>>      check) for the
>       >>>      standard case when no allocations are expected, but
>       allow
>       >>>      authors to
>       >>>      indicate that R should pause GC if needed for methods
>       in their
>       >>>      class.
>       >>>
>       >>>      On Fri, May 28, 2021 at 9:42 AM
>       <luke-tierney using uiowa.edu> wrote:
>       >>>
>       >>>> integer and real Elt methods are not expected to allocate.
>       You
>       >>>      would
>       >>>> have to suspend GC to be able to do that. This currently
>       can't
>       >>>      be done
>       >>>> from package code.
>       >>>>
>       >>>> Best,
>       >>>>
>       >>>> luke
>       >>>>
>       >>>> On Fri, 28 May 2021, Gábor Csárdi wrote:
>       >>>>
>       >>>>> I have found some weird SEXP corruption behavior with
>       >>>      ALTREP, which
>       >>>>> could be a bug. (Or I could be doing something wrong.)
>       >>>>>
>       >>>>> I have an integer ALTREP vector that calls back to R from
>       >>>      the Elt
>       >>>>> method. When this vector is indexed in a lapply(), its
>       first
>       >>>      element
>       >>>>> gets corrupted. Sometimes it's just a type change to
>       >>>      logical, but
>       >>>>> sometimes the corruption causes a crash.
>       >>>>>
>       >>>>> I saw this on macOS from R 3.5.3 to 4.2.0. I created a
>       small
>       >>>      package
>       >>>>> that demonstrates this:
>       >>>      https://github.com/gaborcsardi/redfish
>       >>>>>
>       >>>>> The R callback in this package calls
>       >>>      `loadNamespace("Matrix")`, but
>       >>>>> the same crash happens for other packages as well, and
>       >>>      sometimes it
>       >>>>> also happens if I don't load any packages at all. (But
>       that
>       >>>      example
>       >>>>> was much more complicated, so I went with the package
>       >>>      loading.)
>       >>>>>
>       >>>>> It is somewhat random, and sometimes turning off the JIT
>       >>>      avoids the
>       >>>>> crash, but not always.
>       >>>>>
>       >>>>> Hopefully I am just doing something wrong in the ALTREP
>       code
>       >>>      (see
>       >>>>>
>       >>>     
>       https://github.com/gaborcsardi/redfish/blob/main/src/test.c),
>       >>>      and it
>       >>>>> is not actually a bug.
>       >>>>>
>       >>>>> Thanks,
>       >>>>> Gabor
>       >>>>>
>       >>>>> ______________________________________________
>       >>>>> R-devel using r-project.org mailing list
>       >>>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>       >>>>>
>       >>>>
>       >>>> --
>       >>>> Luke Tierney
>       >>>> Ralph E. Wareham Professor of Mathematical Sciences
>       >>>> University of Iowa                  Phone:
>       >>>       319-335-3386
>       >>>> Department of Statistics and        Fax:
>       >>>       319-335-3017
>       >>>>    Actuarial Science
>       >>>> 241 Schaeffer Hall                  email:
>       >>>       luke-tierney using uiowa.edu
>       >>>> Iowa City, IA 52242                 WWW:
>       >>>      http://www.stat.uiowa.edu
>       >>>> ______________________________________________
>       >>>> R-devel using r-project.org mailing list
>       >>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>       >>>>
>       >>>
>       >>>              [[alternative HTML version deleted]]
>       >>>
>       >>>      ______________________________________________
>       >>>      R-devel using r-project.org mailing list
>       >>>      https://stat.ethz.ch/mailman/listinfo/r-devel
>       >>>
>       >>>
>       >>>
>       >>
>       >> --
>       >> Luke Tierney
>       >> Ralph E. Wareham Professor of Mathematical Sciences
>       >> University of Iowa                  Phone:           
>        319-335-3386
>       >> Department of Statistics and        Fax:             
>        319-335-3017
>       >>    Actuarial Science
>       >> 241 Schaeffer Hall                  email: 
>        luke-tierney using uiowa.edu
>       >> Iowa City, IA 52242                 WWW: 
>       http://www.stat.uiowa.edu
>       >> ______________________________________________
>       >> R-devel using r-project.org mailing list
>       >> https://stat.ethz.ch/mailman/listinfo/r-devel
>       >>
>       >
>       >       [[alternative HTML version deleted]]
>       >
>       > ______________________________________________
>       > R-devel using r-project.org mailing list
>       > https://stat.ethz.ch/mailman/listinfo/r-devel
>       >
> 
> 
>

-- 
Luke Tierney
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa                  Phone:             319-335-3386
Department of Statistics and        Fax:               319-335-3017
    Actuarial Science
241 Schaeffer Hall                  email:   luke-tierney using uiowa.edu
Iowa City, IA 52242                 WWW:  http://www.stat.uiowa.edu


More information about the R-devel mailing list