R-beta: S Compatibility (again)

Mon Apr 13 23:42:05 CEST 1998

Luke Tierney writes:
 > The need to make heavy use of substitute or eval is, in my
 > view, not a strength of a language.  The right tool for this
 > job, in my view at least, is lexical scoping and closures. R
 > provides these directly, so a much simpler definition is
 > 
 > pdf.order<-
 > function (n, r, pfun, dfun) 
 > {
 >   con <- round(exp(lgamma(n + 1) - lgamma(r) - lgamma(n - r + 1)))
 >   function(x) {
 >     Fx <- pfun(x)
 >     con * Fx^(r - 1) * (1 - Fx)^(n - r) * dfun(x)
 >   }
 > }
 > 
 > > pdf.order(9, 5, pnorm, dnorm)(0)
 > [1] 0.981772

Aha!  Very neat indeed.  This is one of those examples that make
concepts like lexical scoping and closures much clearer (at least
to me).

 > The free variables con, n, r, pfun and dfun in the returned
 > function refer to the variables in the defining environment
 > (this is lexical scope).

Yes.  I notice the result is a function "with environment".  It
would be interesting to see how these objects are structured and
how big they are, but I see R draws a discreet veil over this by
declaring functions "not subsetable".  I presume, though, that
all objects in the environment must be copied as part of the
created object, and not just the names, essentially as pointers,
as the S substitute() solution effectively does.
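
One way to look behind the veil, sketched in R: environment() returns
the environment attached to the returned function, and ls() and get()
list and fetch its bindings.  The names f and e are only illustrative.
As far as I understand R, the closure keeps a reference to that
environment rather than a private copy of every object in it, which
the last line is meant to probe.

```r
pdf.order <- function(n, r, pfun, dfun) {
  con <- round(exp(lgamma(n + 1) - lgamma(r) - lgamma(n - r + 1)))
  function(x) {
    Fx <- pfun(x)
    con * Fx^(r - 1) * (1 - Fx)^(n - r) * dfun(x)
  }
}

f <- pdf.order(9, 5, pnorm, dnorm)   # f and e are illustrative names
e <- environment(f)                  # the defining environment of the closure

ls(e)                                # "con" "dfun" "n" "pfun" "r"
get("con", envir = e)                # 630
names(formals(f))                    # "x": no extra arguments on the function
identical(get("pfun", envir = e), pnorm)   # is pfun the same object as pnorm?
```

The function itself still takes only x; everything else lives in the
attached environment.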

 > 
 > Since the free variables in the value function are not
 > modified you can do this in S if you abstract out the closure
 > creation operation into a function MC,
 > 
 > pdf.order<-
 > function (n, r, pfun, dfun) 
 > {
 >   con <- round(exp(lgamma(n + 1) - lgamma(r) - lgamma(n - r + 1)))
 >   MC(function(x) {
 >        Fx <- pfun(x)
 >        con * Fx^(r - 1) * (1 - Fx)^(n - r) * dfun(x)
 >      },
 >      list("con"=con, "pfun"=pfun, "dfun"=dfun, "r"=r, "n"=n))
 > }
 > 
 > > pdf.order(9, 5, pnorm, dnorm)(0)
 > [1] 0.981772
 > 
 > This isn't as clean as in R but it is clearer in its intent,
 > to me at least, than the eval/substitute stuff.
 > 
 > The definition of MC (make closure) I use is not based on
 > substitute.  Substitute, no matter how sophisticated, is in my
 > view fundamentally the wrong tool for this job since it does
 > not understand syntax.  (It is a reasonable tool in this case
 > since the function is very small, but in general it is not).
 > Both S and R have syntactic constructs that can result in
 > shadowing of variables -- substitute cannot understand these
 > and will mess them up.  An alternative is to add bindings for
 > free variables as additional arguments to the function; in
 > this example this produces
 > 
 > > pdf.order(9, 5, pnorm, dnorm)
 > function(x, con = 630,
 >             pfun = function(q, mean = 0, sd = 1) { ... },
 >             dfun = function(x, mean = 0, sd = 1) { ... },
 >             r = 5,
 >             n = 9)
 > {
 >         Fx <- pfun(x)
 >         con * Fx^(r - 1) * (1 - Fx)^(n - r) * dfun(x)
 > }
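
The shadowing problem with substitute mentioned above can be
reproduced in a few lines of R.  This is only a sketch with made-up
names (body_expr, bad, res): the body assigns to a local r, and
substituting r = 5 rewrites the assignment target as well, leaving an
expression that can no longer be evaluated.

```r
# A function body in which r is a local variable that would shadow any outer r.
body_expr <- quote({
  r <- x^2
  r + 1
})

# do.call() hands the stored expression to substitute(), which replaces every
# symbol r it finds, including the one on the left of the assignment.
bad <- do.call("substitute", list(body_expr, list(r = 5)))
print(bad)

# Evaluating the result fails because 5 is not a valid assignment target.
x <- 3
res <- try(eval(bad), silent = TRUE)
inherits(res, "try-error")           # TRUE
```

A closure has no such problem, because the local r is resolved by the
ordinary scoping rules when the function runs.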

This suggests that for my glm.nb example the MC solution would
not work and substitute really is necessary.  In that case it is
important that the generated function does not have additional
arguments, even with default values, since the functions
themselves may be passed on to someone else's language modifiers
(to wit the robust() function, but not always) which make
strong assumptions about which arguments are present.

Never mind, in that case I already have a working solution using
substitute().  Now it seems an R solution is not only possible,
but simpler than that of S.

 > This isn't perfect (real lexical scope is much better) but, in
 > my opinion, it is a better approach than using substitute.

The only reservation I would have is whether the frozen
environment (explicitly in R, as extra arguments with defaults in
S) can get very large.  You would need to use substitute if you
wanted a result with pointers to whatever you used for pfun and
dfun, one which changed if those (here pnorm and dnorm) changed.
Not a good practice, I agree, but a real difference between the
two approaches.
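
The difference can be sketched in R with hypothetical names
(make_frozen, frozen, live, p).  The closure captures the value of its
argument, while a function that refers to p by name, which stands in
here for what the substitute() solution produces, picks up whatever p
is when it is called.  Note the force(): R evaluates arguments lazily,
so without it the "frozen" value would only be fixed at the first call.

```r
# Closure version: the value of pfun is captured when the closure is made.
make_frozen <- function(pfun) {
  force(pfun)                 # evaluate the argument now, not at first use
  function(x) pfun(x)
}

p <- function(x) x + 1        # p is a stand-in for something like pnorm
frozen <- make_frozen(p)

# By-name version: the free variable p is looked up each time live() runs.
live <- function(x) p(x)

p <- function(x) x + 100      # redefine p after both functions exist

frozen(1)                     # 2, still uses the original p
live(1)                       # 101, follows the redefinition
```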

 > Here is the definition of MC I use to implement this.  It is
 > fairly convoluted, but once you have it and understand
 > conceptually what it does, which is quite simple, you don't
 > need to look at it again.
 > 
 > > MC
 > function(f, env = NULL)
 > {
 >         env <- as.list(env)
 >         if(mode(f) != "function")
 >                 stop(paste("not a function:", f))
 >         if(length(env) > 0 && any(names(env) == ""))
 >                 stop(paste("all arguments are not named:", env))
 >         fargs <- if(length(f) > 1) f[1:(length(f) - 1)] else NULL
 >         fbody <- f[length(f)]
 >         cf <- c(fargs, env, fbody)
 >         mode(cf) <- "function"
 >         return(cf)
 > }

Very illuminating.
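
For comparison, the same idea can be sketched in R.  This is not
Luke's S code, only an assumed port: instead of treating the function
as a list, it appends the bindings to the argument list with
formals<-.  The example function g and its values are made up.

```r
MC <- function(f, env = NULL) {
  env <- as.list(env)
  if (!is.function(f))
    stop("not a function")
  if (length(env) > 0 && (is.null(names(env)) || any(names(env) == "")))
    stop("all bindings must be named")
  # Append the bindings to the formal arguments as default values.
  formals(f) <- c(formals(f), env)
  f
}

g <- MC(function(x) con * x^(r - 1), list(con = 630, r = 5))
names(formals(g))   # "x" "con" "r"
g(2)                # 630 * 2^4 = 10080
```

The extra arguments are visible in formals(g), which is exactly the
property that rules this approach out when downstream code assumes a
fixed argument list.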

 > Just my 2c worth :-)

And well worth both of them, even more.  :-) I think I'll frame
it and put it up on the wall as a reminder.

I'm glad I raised this issue now.  I don't mind being wrong if
the result is a useful clarification for us lesser mortals.  I
just wish it wasn't always so necessary for me to learn
everything out in public like this...

Thanks,
Bill