[R] substitute question

Sat Mar 20 04:09:53 CET 2004

Great detective work, Tony!  Thanks.

Date:   Fri, 19 Mar 2004 17:51:27 -0700 
From:   Tony Plate <tplate at blackmesacapital.com>
To:   <ggrothendieck at myway.com>, <tlumley at u.washington.edu> 
Cc:   <R-help at stat.math.ethz.ch> 
Subject:   Re: [R] substitute question 

I don't think it is correct that "there are two types of expressions and 
calls". There is only one type of these things. I believe the relevant 
distinction here is between 'call' objects (which can be informally thought 
of as the parse trees of unevaluated R code, and which formally have mode 
'call' in R) and other things like objects of mode 'function', etc.

However, this is all does pretty confusing when using substitute(), 
because, substitute() does go inside objects that have mode 'call', but 
doesn't go inside of objects that have mode 'function' or 
'expression'. What makes it more confusing is that sometimes 'call' 
objects can be wrapped up in expression objects. Note that parse(text=) 
returns a 'call' object wrapped in an 'expression' object, whereas quote() 
returns a 'call' object -- I believe that in general it is true that 
parse(text="XXX")[[1]] === quote(XXX).

Earlier in this discussion, Peter Dalgard stated "you can only do 
substitutions on language objects" and then used the function is.language() 
in an example, which I took at that time to imply that substitute() would 
go inside objects for which is.language() returned true. However, from 
experimenting, it seems that is.call() rather than is.language() is the 
appropriate test.

Here are some simple examples.

> esub <- function(expr, sublist) do.call("substitute", list(expr, sublist))
> e1 <- parse(text="a + 1")
> e2 <- quote(a + 1)
> e1
expression(a + 1)
> e2
a + 1
> mode(e1)
[1] "expression"
> mode(e2)
[1] "call"
> identical(e1[[1]], e2)
[1] TRUE
>
> # substitute() doesn't go inside e1, even though is.language(e1) is TRUE
> c(is.language(e1), is.call(e1))
[1] TRUE FALSE
> esub(e1, list(a=as.name('b')))
expression(a + 1)
>
> c(is.language(e2), is.call(e2))
[1] TRUE TRUE
> esub(e2, list(a=as.name('b')))
b + 1
>
> c(is.language(e1[[1]]), is.call(e1[[1]]))
[1] TRUE TRUE
> esub(e1[[1]], list(a=as.name('b')))
b + 1
> identical(e2, e1[[1]])
[1] TRUE
>
> ef <- Quote(function() a + 1)
> f <- function() a + 1
> c(is.language(ef), is.call(ef))
[1] TRUE TRUE
> esub(ef, list(a=as.name('b')))
function() b + 1
> c(is.language(f), is.call(f))
[1] FALSE FALSE
> esub(f, list(a=as.name('b')))
function ()
a + 1
> c(is.language(body(f)), is.call(body(f)))
[1] TRUE TRUE
> esub(body(f), list(a=as.name('b')))
b + 1
>
>

I also see that in S-plus 6.2, substitute() behaves differently -- it does 
go inside objects of mode 'call' and 'expression' and substitutes 'b' for 
'a' in every case above. To run the above code in S-plus, first do:
> body <- function(f) f[[1]]
> quote <- Quote

Although there isn't much to guide one in the documentation ?substitute, 
the "R Language manual" does have some discussion of substitute() and 
'expression' objects.

-- Tony Plate

At Thursday 10:58 PM 3/18/2004, Gabor Grothendieck wrote:
>
>Thanks. Thus it seems that there are two types of expressions and calls:
>
>1. fully expanded
>2. partially expanded
>
>and that fully expanded ones are a prerequisite for substitution.
>body() and quote() produce such fully expanded expressions.
>
>Using a small utility function we can investigate this:
>
>recurse <- function( x, idx = NULL )
> if ( length( x ) > 0 ) {
> for( i in seq( along = x ) )
> if (length(x[[i]])>1)
> Recall( x[[i]], c(idx, i))
> else {
> if (length(idx)) cat(idx,"")
> cat( i, class(x[[i]]), ":" )
> cat( rep("\t",length(idx) + 2) )
> print( x[[i]] )
> }
> }
>
>f <- function(){a+1}
>
>eb <- body(f)
>class(eb)
>recurse(eb)
>
>eq <- quote(function(){a+1})
>class(eq)
>recurse(eq)
>
>ep <- parse(text=deparse(f))
>class(ep)
>recurse(ep)
>
>
>The output that the above is shown below. It shows that
>body() and quote() produce fully expanded expression style objects
>although body's is of class { and quote is of class call.
>
>However, parse(text=deparse(f)) also produces a fully expanded
>expression style object of class expression yet substitution
>does not occur with that. Thus full vs. partial expansion is likely
>a necessary but not a sufficient condition. There is something
>else but I don't know what it is.
>
>
> > f <- function(){a+1}
> >
> > eb <- body(f)
> > class(eb)
>[1] "{"
> > recurse(eb)
>1 name : `{`
>2 1 name : `+`
>2 2 name : a
>2 3 numeric : [1] 1
> >
> > eq <- quote(function(){a+1})
> > class(eq)
>[1] "call"
> > recurse(eq) # lines begin with list indices and class name
>1 name : `function`
>2 NULL : NULL
>3 1 name : `{`
>3 2 1 name : `+`
>3 2 2 name : a
>3 2 3 numeric : [1] 1
>4 NULL : NULL
> >
> > ep <- parse(text=deparse(f))
> > class(ep)
>[1] "expression"
> > recurse(ep)
>1 1 name : `function`
>1 2 NULL : NULL
>1 3 1 name : `{`
>1 3 2 1 name : `+`
>1 3 2 2 name : a
>1 3 2 3 numeric : [1] 1
>1 4 NULL : NULL
>
>
>Date: Thu, 18 Mar 2004 17:27:20 -0800 (PST)
>From: Thomas Lumley <tlumley at u.washington.edu>
>To: Gabor Grothendieck <ggrothendieck at myway.com>
>Cc: <tplate at blackmesacapital.com>, <R-help at stat.math.ethz.ch>
>Subject: Re: [R] substitute question
>
>
>On Thu, 18 Mar 2004, Gabor Grothendieck wrote:
>
> >
> >
> > I don't think I expressed myself very well on that.
> >
> > Looking at what we get from the example:
> >
> > > z <- substitute(substitute(expression(f),list(a=quote(b))),list(f=f))
> >
> > > z
> > substitute(expression(function ()
> > {
> > a + 1
> > }), list(a = quote(b)))
> >
> > > class(z);mode(z);typeof(z)
> > [1] "call"
> > [1] "call"
> > [1] "language"
> >
> >
> > we see that the function seems to be expanded correctly and
> > the statement does produce a call object. However,
> > applying eval one, two or three times does not give what
> > you would think if you looked at z above.
>
>Maybe we didn't express ourselves well enough.
>
>Looking at z above isn't enough. z is a call to substitute().
>Its first operand is an expression. The expression contains a single term,
>which is a function.
>
>If you typed
>notz<- quote(substitute(expression(function ()
>{
>a + 1
>}), list(a = quote(b))))
>
>you would obtain something that deparsed the same as z, and so looked the
>same, but was actually different. In notz the first operand of substitute
>is an expression containing multiple terms, which if evaluated would
>return a function.
>
>substitute() goes though this expression and checks each term to see if it
>is `a`. In z there is only one term and it isn't `a`. In notz there is
>(after sufficient recursion) an `a` and it gets replaced.
>
>So
>
> > z[[2]][[2]]
>function ()
>{
>a + 1
>}
> > notz[[2]][[2]]
>function() {
>a + 1
>}
>
>are the respective operands, and they still look the same. But
>
> > mode(z[[2]][[2]])
>[1] "function"
> > mode(notz[[2]][[2]])
>[1] "call"
> > length(z[[2]][[2]])
>[1] 1
> > length(notz[[2]][[2]])
>[1] 4
>
>and if we try to find the actual `a` in there
> > notz[[2]][[2]][[3]][[2]][[2]]
>a
> > z[[2]][[2]][[3]][[2]][[2]]
>Error in z[[2]][[2]][[3]] : object is not subsettable
> >
>
>
> -thomas
>
>
>
>
>
>_______________________________________________
>No banners. No pop-ups. No kidding.
>Introducing My Way - http://www.myway.com