[Rd] some questions about R internal SEXP types

Tue Sep 8 11:07:39 CEST 2020

The general principle is that R packages are only allowed to use what is 
documented in the R help (? command) and in Writing R Extensions. The 
former covers what is allowed from R code in extensions, the latter 
mostly what is allowed from C code in extensions (with some references 
to Fortran).

If you are implementing a Go interface for writing R packages, that Go
interface should thus only use what is in the R help and in Writing R
Extensions. Otherwise, packages would not be able to use such an
interface.

R Internals is meant for understanding the internal structure of the R
implementation, so it is primarily for development of R itself. It can
also help with debugging of R itself and, in some cases, with debugging
or performance analysis of extensions. R Internals can help in giving an
intuition, but people implementing R itself still need to check the
code. R Internals does not describe any interface for external code. If
it states any constraints about, say, pairlists, take them as an
intuition for what was intended and probably holds, or held, at some
level of abstraction; you need to check the source code for the details
anyway (e.g., at some very low level CAR and CDR can be any SEXP or
R_NilValue, and locally in some functions even C NULL). Internally, some
C code uses C NULL SEXPs, but this is rare and local, and again, only
the interface described in Writing R Extensions is for external use.

WRE speaks about "R NULL", "R NULL object" or "C NULL" in some cases to 
avoid confusion, e.g. for values typed as "void *". SEXPs that packages
obtain using the interface in WRE should not be C NULL, only R NULL 
(R_NilValue). External pointers can become C NULL and this is documented 
in WRE 5.13.
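
To illustrate the R NULL versus C NULL distinction, here is a minimal,
self-contained C sketch. It is a toy model only: toy_sexp,
toy_nil_object, TOY_NIL_VALUE, toy_cons_init, toy_length, toy_to_native
and toy_from_native are hypothetical names made up for this example.
They are not part of R's API and do not reflect R's actual memory
layout. The sketch shows a pairlist terminated by an allocated singleton
rather than by C NULL, plus the kind of boundary conversion a wrapper
would need if it chose to use a native null pointer internally.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for illustration only; NOT R's real definitions. */
typedef enum { TOY_NIL, TOY_LIST, TOY_INT } toy_type;

typedef struct toy_sexp {
    toy_type type;
    int value;             /* payload, meaningful only for TOY_INT */
    struct toy_sexp *car;  /* element, meaningful only for TOY_LIST */
    struct toy_sexp *cdr;  /* rest of list, meaningful only for TOY_LIST */
} toy_sexp;

/* A singleton object in the role of "R NULL": it is a real object with
 * an address, so it is never equal to C NULL. In this toy its car and
 * cdr simply point back at itself. */
static toy_sexp toy_nil_object = {
    TOY_NIL, 0, &toy_nil_object, &toy_nil_object
};
#define TOY_NIL_VALUE (&toy_nil_object)

/* Fill in a caller-provided cell as a list node (no allocation here). */
static void toy_cons_init(toy_sexp *cell, toy_sexp *car, toy_sexp *cdr)
{
    cell->type = TOY_LIST;
    cell->value = 0;
    cell->car = car;
    cell->cdr = cdr;
}

/* Count list cells. Traversal stops at the singleton, not at C NULL. */
static int toy_length(const toy_sexp *list)
{
    int n = 0;
    for (const toy_sexp *p = list; p != TOY_NIL_VALUE; p = p->cdr) {
        assert(p != NULL);            /* C NULL must never appear here */
        assert(p->type == TOY_LIST);  /* the tail consists of list cells */
        n++;
    }
    return n;
}

/* Wrapper-side view: map the singleton to a native null pointer. */
static toy_sexp *toy_to_native(toy_sexp *x)
{
    return x == TOY_NIL_VALUE ? NULL : x;
}

/* Map a native null pointer back to the singleton before the value is
 * handed back across the boundary. */
static toy_sexp *toy_from_native(toy_sexp *x)
{
    return x == NULL ? TOY_NIL_VALUE : x;
}
```

With the real R headers, the same loop shape would presumably compare
against R_NilValue and step with CDR(), using only what Writing R
Extensions documents. The point of toy_from_native is that a wrapper
using a native null internally has to convert it back before the value
reaches R.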

Best
Tomas

On 9/6/20 3:44 AM, Dan Kortschak via R-devel wrote:
> Hello,
>
> I am writing an R/Go interoperability tool[1] that works similarly to
> Rcpp; the tool takes packages written in Go and performs the necessary
> Go type analysis to wrap the Go code with C and R shims that allow the
> Go code to then be called from R. The system is largely complete (with
> the exception of having a clean approach to handling generalised
> attributes in the easy case[2] - the less hand holding case does handle
> these). Testing of some of the code is unfortunately lacking because of
> the difficulties of testing across environments.
>
> To make the system flexible I have provided an (intentionally
> incomplete) Go API into the R internals which allows reasonably Go
> type-safe interaction with SEXP values (Go does not have unions, so
> this is uglier than it might be otherwise and unions are faked with Go
> interface values). For efficiency reasons I've avoided using R internal
> calls where possible (accessors are done with Go code directly, but
> allocations are done in R's C code to avoid having to duplicate the
> garbage collection mechanics in Go with the obvious risks of error and
> possible behaviour skew in the future).
>
> In doing this work I have some questions that I have not been able to
> find answers for in the R-ints doc or hadley/r-internals.
>
>     1. In R-ints, the LISTSXP SEXP type CDR is said to hold "usually"
>        LISTSXP or NULL. What does the "usually" mean here? Is it possible
>        for the CDR to hold values other than LISTSXP or NULL, and is
>        this NULL NILSXP or C NULL? I assume that the CAR can hold any type
>        of SEXP, is this correct?
>     2. The LANGSXP and DOTSXP types are lists, but the R-ints comments on
>        them do not say whether the CDR of one of these lists is the same as
>        the head of the list or devolves to a LISTSXP. Looking through the
>        code suggests to me that functions that allocate these two types
>        allocate a LISTSXP and then change only the head of the list to be
>        the LANGSXP or DOTSXP that's required, meaning that the tail of the
>        list is all LISTSXP. Is this correct?
>
> The last question is more a question of interest in design strategy,
> and the answer may have been lost to time. In order to reduce the need
> to go through Go's interface assertions in a number of cases I have
> decided to reinterpret R_NilValue to an untyped Go nil (this is
> important for example in list traversal where the CDR can (hopefully)
> be only one of two types LISTSXP or NILSXP; in Go this would require a
> generalised SEXP return, but by doing this reinterpretation I can
> return a *List pointer which may be nil, greatly simplifying the code
> and improving the performance). My question here is why a singleton null
> value was chosen to be represented as a fully allocated SEXP value
> rather than just a C NULL. Also, whether C NULL is used to any great
> extent within the internal code. Note that the Go API provides a
> mechanism to easily reconvert the nils used back to R_NilValue when
> returning from a Go function[3].
>
> thanks
> Dan Kortschak
>
> [1]https://github.com/rgonomic/rgo
> [2]https://github.com/rgonomic/rgo/issues/1
> [3]https://pkg.go.dev/github.com/rgonomic/rgo/sexp?tab=doc#Value.Export