[Rd] must .Call C functions return SEXP?

Andrew Piskorski atp at piskorski.com
Thu Oct 28 15:48:21 CEST 2010


On Thu, Oct 28, 2010 at 12:15:56AM -0400, Simon Urbanek wrote:

> > Reason I ask, is I've written some R code which allocates two long
> > lists, and then calls a C function with .Call.  My C code writes to
> > those two pre-allocated lists,

> That's bad! All arguments are essentially read-only so you should
> never write into them! 

I don't see how.  (So, what am I missing?)  The R docs themselves
state that the main point of using .Call rather than .C is that .Call
does not do any extra copying and gives one direct access to the R
objects.  (This is indeed very useful, e.g. to reorder a large matrix
in seconds rather than hours.)

I could allocate the two lists in my C code, but so far it was more
convenient to do so in R.  What possible difference in behavior can there
be between the two approaches?

> R has pass-by-value(!) semantics, so semantically your code has
> nothing to do with the result.1 and result.2 variables since only
> their *values* are guaranteed to be passed (possibly a copy).

Clearly C code called from .Call must be allowed to construct R
objects, as that's how much of R itself is implemented, and further
down, it's what you recommend I should do instead.

But why does it follow that C code must never modify an object
initially allocated by R code?  Are you saying there is some special
magic difference in the state of an object allocated by R's C code
vs. one allocated by R code?  If so, what is it?

What is the potential problem here, that the garbage collector will
suddenly run while my C code is in the middle of writing to an R list?
Yes, if the gc is going to move the object elsewhere, that would be
very bad.  But it looks to me like that cannot happen, because lots of
the R implementation itself would fail badly if it did.

E.g.:  The PROTECT call is used to increment reference counts, but I
see no guarantees that it is atomic with the operations that allocate
objects.  I see no mutexes or other barriers in C code to prevent the
gc from running, thus implying that it *can't* run until the C
function completes.

And R is single threaded, of course.  But what about signal handlers,
could they ever invoke R's gc?

Also, I was initially surprised not to find any matrix C APIs, but
grepping for examples (sorry, I don't remember exactly which
functions) showed me that the apparently accepted way to do matrix
operations from C is to simply assume R's column-first dense matrix
order, and access the 2D matrix as a flat 1D vector.  (Which is easy.)
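
To make the flat-vector access concrete, here is a minimal sketch in
plain C, with no R headers involved.  The helper names mat_get and
mat_col_sum are hypothetical, made up for illustration.  The only
assumption carried over from the paragraph above is the column-first
layout, i.e. that element (i, j) of a matrix with nrow rows sits at
offset i + j*nrow (0-based).

```c
#include <stddef.h>

/* Element (i, j), 0-based, of a column-first matrix with nrow rows,
   stored as one flat array. */
static double mat_get(const double *x, size_t nrow, size_t i, size_t j)
{
    return x[i + j * nrow];
}

/* Sum of column j: its nrow elements are contiguous in memory. */
static double mat_col_sum(const double *x, size_t nrow, size_t j)
{
    double sum = 0.0;
    for (size_t i = 0; i < nrow; i++)
        sum += x[i + j * nrow];
    return sum;
}
```

For a 2 x 3 matrix stored as 1, 2, 3, 4, 5, 6, mat_get(x, 2, 0, 2)
picks out 5, the first row of the third column.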

> The fact that internally R attempts to avoid copying for performance
> reasons is the only reason why your code may have appeared to work,
> but it's invalid!

I will probably change my code to allocate a new list from the C code
and return that, as you recommend.  My main reason for doing the
allocation in R was just that it was simpler, especially given the
very limited documentation of R's C API.

But, I didn't see anything in the "Writing R Extensions" doc saying
that what my code is doing is "invalid", and more importantly, I don't
see why it would or should be invalid...

I'd still like to better understand why you think doing the initial
allocation of an object in R rather than C code is such a problem.  So
far, I don't see any way that the R interpreter could ever tell the
difference.

Wait, or is the only objection here that I'm using C in a way that
makes pass-by-reference semantics visible to my R code?  Which will
work completely correctly, but is not The Proper R Way?

I don't actually need pass-by-reference behavior here at all, but I
can imagine cases where I might want it, so I'd like to understand
your objections better.  Is using C to implement pass-by-reference
actually Broken, or merely Ugly?  For the reasons above, I think it
will always work correctly and thus is not Broken.  But of course
given R's devotion to pass-by-value, it could be considered
unacceptably Ugly.
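
One scenario that may bear on the Broken-vs-Ugly question is two names
sharing a single underlying object because an assignment avoided a
copy.  The following is a toy model in plain C, not R's actual
internals; every name in it (toy_vec, toy_name, toy_bind, and so on)
is hypothetical.  It only sketches the general idea: an in-place write
through one name becomes visible through the other, while a
copy-on-modify write keeps the two apart.

```c
#include <stdlib.h>
#include <string.h>

/* Toy value: a buffer plus a count of the names bound to it. */
typedef struct {
    double *data;
    size_t  len;
    int     nrefs;
} toy_vec;

/* Toy variable binding. */
typedef struct {
    toy_vec *v;
} toy_name;

/* Allocate a zero-filled value; returns NULL on allocation failure. */
static toy_vec *toy_alloc(size_t len)
{
    toy_vec *v = malloc(sizeof *v);
    if (v == NULL)
        return NULL;
    v->data = calloc(len, sizeof *v->data);
    if (v->data == NULL) {
        free(v);
        return NULL;
    }
    v->len = len;
    v->nrefs = 0;
    return v;
}

/* Bind a name to a value without copying (think: b <- a). */
static void toy_bind(toy_name *n, toy_vec *v)
{
    n->v = v;
    v->nrefs++;
}

/* In-place write: ignores whether the value is shared. */
static void toy_write_inplace(toy_name *n, size_t i, double x)
{
    n->v->data[i] = x;
}

/* Copy-on-modify write: duplicate first if another name shares the value.
   Returns 0 on success, -1 if the copy could not be allocated. */
static int toy_write_cow(toy_name *n, size_t i, double x)
{
    if (n->v->nrefs > 1) {
        toy_vec *copy = toy_alloc(n->v->len);
        if (copy == NULL)
            return -1;
        memcpy(copy->data, n->v->data, n->v->len * sizeof *copy->data);
        n->v->nrefs--;      /* this name lets go of the shared value */
        toy_bind(n, copy);  /* and now owns a private copy */
    }
    n->v->data[i] = x;
    return 0;
}
```

In this model, binding names a and b to the same toy_vec and then
calling toy_write_inplace through a also changes what b sees, whereas
toy_write_cow leaves b untouched.  Whether a given R object is shared
in this way at the moment C code writes to it is exactly what the C
code would have to know.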

-- 
Andrew Piskorski <atp at piskorski.com>
http://www.piskorski.com/


