[Rd] function can permanently modify calling function via substitute?

Luke Tierney luke at stat.uiowa.edu
Fri Sep 26 19:02:13 CEST 2008


On Wed, 24 Sep 2008, Luke Tierney wrote:

> On Wed, 24 Sep 2008, Peter Dalgaard wrote:
>
>> Perry de Valpine wrote:
>>> Dear R-devel:
>>> 
>>> The following code seems to allow one function to permanently modify a
>>> calling function.  I did not expect this would be allowed (short of
>>> more creative gymnastics) and wonder if it is really intended.  (I can
>>> see other ways to accomplish the intended task of this code [e.g. via
>>> match.call instead of substitute below] that do not trigger the
>>> problem, but I don't think that is the point.)
>>> 
>>> do.nothing <- function(blah) {force(blah)}
>>> 
>>> do.stuff.with.call <- function(mycall) {
>>>   raw.mycall <- substitute(mycall);   # expected raw.mycall would be local
>>>   print( sys.call() )
>>>
>>>   # do.nothing( raw.mycall );  # See below re: commented lines.
>>>   # .Call( "showNAMED", raw.mycall[[2]] )
>>>
>>>   force( mycall );  # not relevant where (or whether) this is done
>>>   raw.mycall[[2]] <- runif(1); # permanently modifies try.me on the first time only
>>>
>>>   # .Call( "showNAMED", raw.mycall[[2]] )
>>>
>>>   raw.mycall
>>> }
>>> 
>>> gumbo <- function(x) {
>>>   writeLines( paste( "gumbo : x =" ,  x ) )
>>>   return(x);
>>> }
>>> 
>>> try.me <- function() {
>>>   one.val <- 111;
>>>   one.ans <- do.stuff.with.call( mycall = gumbo( x = one.val ) );
>>>   one.ans
>>> }
>>> 
>>> # after source()ing the above:
>>> 
>>> > deparse(try.me)
>>> [1] "function () "
>>> [2] "{"
>>> [3] "    one.val <- 111"
>>> [4] "    one.ans <- do.stuff.with.call(mycall = gumbo(x = one.val))"
>>> [5] "    one.ans"
>>> [6] "}"
>>> 
>>> > try.me()
>>> do.stuff.with.call(mycall = gumbo(x = one.val))
>>> gumbo : x = 0.396524668671191
>>> gumbo(x = 0.396524668671191)
>>> 
>>> > deparse(try.me)
>>> [1] "function () "
>>> [2] "{"
>>> [3] "    one.val <- 111"
>>> [4] "    one.ans <- do.stuff.with.call(mycall = gumbo(x = 0.396524668671191))"
>>> [5] "    one.ans"
>>> [6] "}"
>>> 
>>> > try.me()
>>> do.stuff.with.call(mycall = gumbo(x = 0.396524668671191))
>>> gumbo : x = 0.396524668671191
>>> gumbo(x = 0.0078618151601404)
>>> 
>>> > deparse(try.me)
>>> [1] "function () "
>>> [2] "{"
>>> [3] "    one.val <- 111"
>>> [4] "    one.ans <- do.stuff.with.call(mycall = gumbo(x = 0.396524668671191))"
>>> [5] "    one.ans"
>>> [6] "}"
>>> 
>>> So, after the first call of try.me(), do.stuff.with.call has
>>> permanently replaced the name one.val in line 2 of try.me with a
>>> numeric (0.396...).  Subsequent calls from try.me to
>>> do.stuff.with.call now reflect that change, but do.stuff.with.call
>>> does not modify the try.me object again. (Note this means one needs to
>>> keep reloading try.me to investigate).
>>> 
>>> If this is a problem worth investigating, here are a couple of other
>>> observations that may be relevant but are obviously speculative.
>>> 
>>> 1. If the third line of do.stuff.with.call is uncommented (and try.me
>>> also reloaded), the unexpected behavior does not occur.  Since
>>> do.nothing is eponymous, I was surprised because I believed it should
>>> not impact any other behavior.  Speculating with limited knowledge, I
>>> thought this might implicate something that is supposed to stay
>>> under-the-hood, such as the "`call by value' illusion" described in
>>> the "R internals" documentation.
>>> 
>>> 2. Poking slightly further, I looked at the NAMED values using this C
>>> code via R CMD SHLIB and dyn.load:
>>> #include "R.h"
>>> #include "Rdefines.h"
>>> SEXP showNAMED(SEXP obj) {
>>>   Rprintf("%i\n", NAMED(obj));
>>>   return(R_NilValue);
>>> }
>>> Uncommenting the .Call lines in do.stuff.with.call (with the
>>> do.nothing line re-commented) reveals that on the first time
>>> do.stuff.with.call is called from try.me, raw.mycall[[2]] has NAMED ==
>>> 1 both before and after the `[[<-` line.  On subsequent calls it has
>>> NAMED == 2 before and NAMED == 1 after.  If I follow how NAMED is
>>> used, this seems relevant.
>>> 
>>> 
>> Yes and no. This does sound like a bug and NAMED is likely involved, but I 
>> don't think raw.mycall[[2]] is the thing to look at. More likely, the issue 
>> is that raw.mycall itself has NAMED == 1 because otherwise [[<- assignment 
>> would duplicate it first. This suggests that substitute has the bug.
>
> Our extraction functions, like [[, bump up the NAMED value for
> components to the value for the container (or to 2 -- doesn't look
> like we are consistent here).  substitute() doesn't do that, and
> perhaps could.  But arguably it is the point where the promise (from
> which substitute gets the expression) is created that is the
> extraction point. We could have mkPromise test for NAMED == 2 and bump
> up if it isn't.  We could also have parse create all LANGSXPs with
> NAMED == 2 but that leaves out programmatically created functions.
> Either change fixes this bug; not sure which is the best one (or
> whether we should do both).  Changing mkPromise is more conservative
> and potentially a little more costly but probably not enough to
> notice.
>
> luke

I decided it is safest to make the change in mkPromise but that we
might as well also make the change in substitute, so both are in as of
rev 46573.
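
For anyone who wants to confirm the behavior on their own build, something
along these lines should do as a rough check. It is only a sketch that reuses
the function names from Perry's example (gumbo is trimmed down to return its
argument); the variables before, after, result and unchanged are introduced
here purely for illustration.

```r
# Sketch of a regression check, assuming the functions from the original report.
do.stuff.with.call <- function(mycall) {
  raw.mycall <- substitute(mycall)  # unevaluated argument expression
  force(mycall)                     # evaluate the promise, as in the report
  raw.mycall[[2]] <- runif(1)       # should change only the local copy
  raw.mycall
}

gumbo <- function(x) x              # simplified stand-in for the original gumbo

try.me <- function() {
  one.val <- 111
  do.stuff.with.call(mycall = gumbo(x = one.val))
}

before <- deparse(try.me)           # caller's code before the call
result <- try.me()                  # the modified call, e.g. gumbo(x = <random number>)
after  <- deparse(try.me)           # caller's code after the call

# Expected to be TRUE when the caller is left untouched;
# FALSE would mean try.me itself was modified.
unchanged <- identical(before, after)
print(unchanged)
```

If unchanged comes out FALSE, the assignment into raw.mycall has leaked back
into try.me, which is the symptom described at the top of this thread.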

luke

-- 
Luke Tierney
Chair, Statistics and Actuarial Science
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa                  Phone:             319-335-3386
Department of Statistics and        Fax:               319-335-3017
    Actuarial Science
241 Schaeffer Hall                  email:      luke at stat.uiowa.edu
Iowa City, IA 52242                 WWW:  http://www.stat.uiowa.edu


