[Rd] some questions about R internal SEXP types

Dan Kortschak
Tue Sep 8 11:47:30 CEST 2020


Thanks, Tomas.

This is unfortunate. Calling between Go and C is not cheap: the gc
implementation of the Go compiler (as opposed to gccgo) uses a
different calling convention from C, and there are checks to ensure
that pointers to Go-allocated memory do not leak into C code. For this
reason I wanted to avoid such calls wherever possible (I cannot avoid
them for allocations, since I don't want to keep tracking changes in
how R implements its GC and allocation).

However, if the behaviour of the standard SEXP types and the handling
of attributes are not highly volatile, I think that what I'm doing will
be OK; at worst the Go code will panic and result in an R error. The
necessary interface to R for allocations is only eight functions[1].
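
To illustrate the "at worst the Go code will panic" point, here is a
minimal sketch of how a panic raised by a direct Go accessor can be
turned into an ordinary error value that a shim could then report as an
R error. The names kind, value, mustReals and safely are hypothetical
and are not taken from rgo; the real SEXP layout is not modelled.

```go
package main

import "fmt"

// kind is a hypothetical stand-in for an R SEXP type tag.
type kind int

const (
	realKind kind = iota
	stringKind
)

// value is a hypothetical stand-in for a SEXP as seen from Go.
type value struct {
	kind  kind
	reals []float64
}

// mustReals panics if v is not a real vector, as a direct Go accessor
// might when its assumption about the value's type is violated.
func mustReals(v value) []float64 {
	if v.kind != realKind {
		panic(fmt.Sprintf("unexpected kind %d, want %d", v.kind, realKind))
	}
	return v.reals
}

// safely runs fn and converts any panic into an error, so that the
// caller can report it instead of crashing the process.
func safely(fn func()) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("go panic: %v", r)
		}
	}()
	fn()
	return nil
}

func main() {
	err := safely(func() { mustReals(value{kind: stringKind}) })
	fmt.Println(err)
}
```

How the recovered message would be handed to R is left out here; that
step belongs on the C side of the shim.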

Note that there is a lot in WRE that's beyond what I want rgo to be
able to do (calling into R from Go, for example). In fact, there's just
a lot in WRE (it's almost three times the length of the Go language
spec and memory model reference combined). The issues around weak
references and external pointers are not something that I want to deal
with; working with that kind of object is not idiomatic for Go (in
fact, without using C.malloc, R external pointers from Go would be
forbidden by the Go runtime) and I would not expect them to be used by
people writing extensions for R in Go.

Dan

[1] https://github.com/rgonomic/rgo/blob/2ce7717c85516bbfb94d0b5c7ef1d9749dd1f817/sexp/r_internal.go#L86-L118

On Tue, 2020-09-08 at 11:07 +0200, Tomas Kalibera wrote:
> The general principle is that R packages are only allowed to use what
> is documented in the R help (? command) and in Writing R Extensions.
> The former covers what is allowed from R code in extensions, the
> latter mostly what is allowed from C code in extensions (with some
> references to Fortran).
> 
> If you are implementing a Go interface for writing R packages, such a
> Go interface should thus only use what is in the R help and in
> Writing R Extensions. Otherwise, packages would not be able to use
> such an interface.
> 
> What is described in R Internals is for understanding the internal
> structure of the R implementation itself, so it is for development of
> R itself. It can indeed also help with debugging R itself and, in
> some cases, with debugging or performance analysis of extensions. R
> Internals can help in giving an intuition, but when people are
> implementing R itself, they also need to check the code. R Internals
> does not describe any interface for external code. If it states any
> constraints about, say, pairlists, take them as an intuition for what
> has been intended and probably holds or held at some level of
> abstraction, but you need to check the source code for the details
> anyway (e.g., at some very low level CAR and CDR can be any SEXP or
> R_NilValue, locally in some functions even C NULL). Internally, some
> C code uses C NULL SEXPs, but it is rare and local, and again, only
> the interface described in Writing R Extensions is for external use.
> 
> WRE speaks about "R NULL", "R NULL object" or "C NULL" in some cases
> to avoid confusion, e.g. for values typed as "void *". SEXPs that
> packages obtain using the interface in WRE should not be C NULL, only
> R NULL (R_NilValue). External pointers can become C NULL and this is
> documented in WRE 5.13.
> 
> Best
> Tomas
> 
> On 9/6/20 3:44 AM, Dan Kortschak via R-devel wrote:
> > Hello,
> > 
> > I am writing an R/Go interoperability tool[1] that works similarly
> > to Rcpp; the tool takes packages written in Go and performs the
> > necessary Go type analysis to wrap the Go code with C and R shims
> > that allow the Go code to then be called from R. The system is
> > largely complete (with the exception of having a clean approach to
> > handling generalised attributes in the easy case[2] - the less hand
> > holding case does handle these). Testing of some of the code is
> > unfortunately lacking because of the difficulties of testing across
> > environments.
> > 
> > To make the system flexible I have provided an (intentionally
> > incomplete) Go API into the R internals which allows reasonably Go
> > type-safe interaction with SEXP values (Go does not have unions, so
> > this is uglier than it might be otherwise and unions are faked with
> > Go interface values). For efficiency reasons I've avoided using R
> > internal calls where possible (accessors are done with Go code
> > directly, but allocations are done in R's C code to avoid having to
> > duplicate the garbage collection mechanics in Go with the obvious
> > risks of error and possible behaviour skew in the future).
> > 
> > In doing this work I have some questions that I have not been able
> > to find answers for in the R-ints doc or hadley/r-internals.
> > 
> >     1. In R-ints, the LISTSXP SEXP type CDR is said to hold
> >        "usually" LISTSXP or NULL. What does the "usually" mean
> >        here? Is it possible for the CDR to hold values other than
> >        LISTSXP or NULL, and is this NULL NILSXP or C NULL? I assume
> >        that the CAR can hold any type of SEXP, is this correct?
> >     2. The LANGSXP and DOTSXP types are lists, but the R-ints
> >        comments on them do not say whether the CDR of one of these
> >        lists is the same as the head of the list or devolves to a
> >        LISTSXP. Looking through the code suggests to me that
> >        functions that allocate these two types allocate a LISTSXP
> >        and then change only the head of the list to be the LANGSXP
> >        or DOTSXP that's required, meaning that the tail of the list
> >        is all LISTSXP. Is this correct?
> > 
> > The last question is more a question of interest in design
> > strategy, and the answer may have been lost to time. In order to
> > reduce the need to go through Go's interface assertions in a number
> > of cases I have decided to reinterpret R_NilValue to an untyped Go
> > nil (this is important for example in list traversal where the CDR
> > can (hopefully) be only one of two types, LISTSXP or NILSXP; in Go
> > this would require a generalised SEXP return, but by doing this
> > reinterpretation I can return a *List pointer which may be nil,
> > greatly simplifying the code and improving the performance). My
> > question here is why a singleton null value was chosen to be
> > represented as a fully allocated SEXP value rather than just a C
> > NULL. Also, whether C NULL is used to any great extent within the
> > internal code. Note that the Go API provides a mechanism to easily
> > reconvert the nils used back to an R_NilValue when returning from a
> > Go function[3].
> > 
> > thanks
> > Dan Kortschak
> > 
> > [1]https://github.com/rgonomic/rgo
> > [2]https://github.com/rgonomic/rgo/issues/1
> > [3]https://pkg.go.dev/github.com/rgonomic/rgo/sexp?tab=doc#Value.Export
> > 