[R] eval(parse(text vs. get when accessing a function

Mon Jan 8 00:40:22 CET 2007

The S4 is not essential.  You could do it in S3 too:

> f.a <- function(x) x+1
> f.b <- function(x) x+2
> f <- function(x) UseMethod("f")
>
> f(structure(10, class = "a"))
[1] 11
attr(,"class")
[1] "a"
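
To connect this to the geneSelect.* naming used in the thread, here is a minimal sketch, assuming the selector arrives as a character string. The string becomes the class of a dummy object and UseMethod does the lookup. The method bodies and the helper selectorFor are hypothetical placeholders, not code from the thread.

```r
## Generic: dispatch happens on the class of the first argument, 'type'.
geneSelect <- function(type, x, ...) UseMethod("geneSelect")

## Hypothetical placeholder methods; the real bodies are not shown in the thread.
geneSelect.Fratio   <- function(type, x, ...) paste("Fratio on", length(x), "values")
geneSelect.Wilcoxon <- function(type, x, ...) paste("Wilcoxon on", length(x), "values")

## Fallback when no method matches the requested name.
geneSelect.default <- function(type, x, ...)
    stop("no gene selection method for type '", class(type)[1], "'")

## Assumed helper: wrap the string in an object whose class drives dispatch.
selectorFor <- function(name) structure(list(), class = name)

res <- geneSelect(selectorFor("Fratio"), 1:4)
```

An unknown name falls through to geneSelect.default and raises an error, so nothing unintended is called.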

On 1/6/07, Ramon Diaz-Uriarte <rdiaz02 at gmail.com> wrote:
> Hi Martin,
>
>
>
> On 1/6/07, Martin Morgan <mtmorgan at fhcrc.org> wrote:
> > Hi Ramon,
> >
> > It seems like a naming convention (f.xxx) and eval(parse(...)) are
> > standing in for objects (of class 'GeneSelector', say, representing a
> > function with a particular form and doing a particular operation) and
> > dispatch (a function 'geneConverter' might handle a converter of class
> > 'GeneSelector' one way, user supplied ad-hoc functions more carefully;
> > inside geneConverter the only real concern is that the converter
> > argument is in fact a callable function).
> >
> > eval(parse(...)) brings scoping rules to the fore as an explicit
> > programming concern; here scope is implicit, but that's probably better
> > -- R will get its own rules right.
> >
> > Martin
> >
> > Here's an S4 sketch:
> >
> > setClass("GeneSelector",
> >          contains="function",
> >          representation=representation(description="character"),
> >          validity=function(object) {
> >              msg <- NULL
> >              argNames <- names(formals(object))
> >              if (!identical(argNames[1], "x"))
> >                msg <- c(msg, "\n  GeneSelector requires a first argument named 'x'")
> >              if (!"..." %in% argNames)
> >                msg <- c(msg, "\n  GeneSelector requires '...' in its signature")
> >              if (0==length(object@description))
> >                msg <- c(msg, "\n  Please describe your GeneSelector")
> >              if (is.null(msg)) TRUE else msg
> >          })
> >
> > setGeneric("geneConverter",
> >            function(converter, x, ...) standardGeneric("geneConverter"),
> >            signature=c("converter"))
> >
> > setMethod("geneConverter",
> >           signature(converter="GeneSelector"),
> >           function(converter, x, ...) {
> >               ## important stuff here
> >               converter(x, ...)
> >           })
> >
> > setMethod("geneConverter",
> >           signature(converter="function"),
> >           function(converter, x, ...) {
> >               message("ad-hoc converter; hope it works!")
> >               converter(x, ...)
> >           })
> >
> > and then...
> >
> > > c1 <- new("GeneSelector",
> > +           function(x, ...) prod(x, ...),
> > +           description="Product of x")
> > >
> > > c2 <- new("GeneSelector",
> > +           function(x, ...) sum(x, ...),
> > +           description="Sum of x")
> > >
> > > geneConverter(c1, 1:4)
> > [1] 24
> > > geneConverter(c2, 1:4)
> > [1] 10
> > > geneConverter(mean, 1:4)
> > ad-hoc converter; hope it works!
> > [1] 2.5
> > >
> > > cvterr <- new("GeneSelector", function(y) {})
> > Error in validObject(.Object) : invalid class "GeneSelector" object: 1:
> >   GeneSelector requires a first argument named 'x'
> > invalid class "GeneSelector" object: 2:
> >   GeneSelector requires '...' in its signature
> > invalid class "GeneSelector" object: 3:
> >   Please describe your GeneSelector
> > > xxx <- 10
> > > geneConverter(xxx, 1:4)
> > Error in function (classes, fdef, mtable)  :
> >         unable to find an inherited method for function "geneConverter", for signature "numeric"
> >
>
>
>
> Thanks!! That is actually a rather interesting alternative approach
> and I can see it also adds a lot of structure to the problem. I have
> to confess, though, that I am not a fan of OOP (nor of S4 classes); in
> this case, in particular, it seems there is a lot of scaffolding in
> the code above (the counterpoint to the structure?) and, regarding
> scoping rules, I prefer to think about them explicitly (I find it much
> simpler than inheritance).
>
> Best,
>
> R.
>
>
> >
> > "Ramon Diaz-Uriarte" <rdiaz02 at gmail.com> writes:
> >
> > > Dear Greg,
> > >
> > >
> > > On 1/5/07, Greg Snow <Greg.Snow at intermountainmail.org> wrote:
> > >> Ramon,
> > >>
> > >> I prefer to use the list method for this type of thing. Here are a couple of reasons why (maybe you are more organized than me and would never do some of the stupid things that I have, so these don't apply to you, but you can see that the general suggestion applies to some of the rest of us).
> > >>
> > >
> > >
> > > Those suggestions do apply to me of course (no claim to being
> > > organized nor beyond idiocy here). And actually the suggestions on
> > > this thread are being very useful. I think, though, that I was not
> > > very clear on the context and my examples were too dumbed down. So
> > > I'll try to give more detail (nothing here is secret, I am just trying
> > > not to bore people).
> > >
> > > The code is part of a web-based application, so there is no
> > > interactive user. The R code is passed the arguments (and optional
> > > user functions) from the CGI.
> > >
> > > There is one "core" function (call it cvFunct) that, among other
> > > things, does cross-validation. So this is one way to do things:
> > >
> > > cvFunct <- function(whatever, genefiltertype, whateverelse) {
> > >       internalGeneSelect <- eval(parse(text = paste("geneSelect",
> > >                                              genefiltertype, sep = ".")))
> > >
> > >       ## do things calling internalGeneSelect,
> > > }
> > >
> > > and now define all possible functions as
> > >
> > > geneSelect.Fratio <- function(x, y, z) { }    ## something
> > > geneSelect.Wilcoxon <- function(x, y, z) { }  ## something else
> > >
> > > If I want more geneSelect functions, adding them is simple. And I can
> > > even allow the user to pass her/his own functions, with the only
> > > restriction that it takes three args, x, y, z, and that the function
> > > is to be called: "geneSelect." and a user chosen string. (Yes, I need
> > > to make sure no calls to "system", etc, are in the user code, etc,
> > > etc, but that is another issue).
> > >
> > > The general idea is not new of course. For instance, in package
> > > "e1071", a somewhat similar thing is done in function "tune", and
> > > David Meyer there uses "do.call". However, tune is a lot more general
> > > than what I had in mind. For instance, "tune" deals with arbitrary
> > > functions, with arbitrary numbers and names of parameters, whereas my
> > > functions above all take only three arguments (x: a matrix, y: a
> > > vector; z: an integer), so the neat functionality provided by
> > > "do.call", and passing the args as a list is not really needed.
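
For comparison, a minimal sketch of the do.call route mentioned above, assuming the three arguments are collected in a named list. do.call accepts the function name as a character string, so no parse() is involved. The body of geneSelect.Fratio is a hypothetical placeholder.

```r
## Placeholder selector (hypothetical body): reports what it was given.
geneSelect.Fratio <- function(x, y, z)
    list(n = length(x), groups = length(unique(y)), keep = z)

## Arguments gathered in a named list, as do.call expects.
args <- list(x = c(1.2, 3.4, 5.6), y = c("a", "b", "a"), z = 2L)

## The function is identified by its name, built from the postfix.
res <- do.call(paste("geneSelect", "Fratio", sep = "."), args)
```

With a fixed three-argument signature this buys little over a direct call, which matches the point made in the paragraph above.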
> > >
> > > So, given that my situation is so structured, and I do not need
> > > "do.call", I think the approach via eval(parse(paste makes my life
> > > simple:
> > >
> > > a) the central function (cvFunct) uses something I can easily
> > > recognize: "internalGeneSelect"
> > >
> > > b) after the initial eval(parse(text I do not need to worry anymore
> > > about what the "true" gene selection function is called
> > >
> > > c) adding new functions and calling them is simple: function naming
> > > follows a simple pattern ("geneSelect." + postfix) and calling the
> > > user function only requires passing the postfix to cvFunct.
> > >
> > > d) notice also that, at least the functs. I define, will of course not
> > > be named "f.1", etc, but rather things like "geneSelect.Fratio" or
> > > "geneSelect.namesThatStartWithCuteLetters";
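
Points a) to d) carry over unchanged if the lookup is done with get() instead of eval(parse()). A minimal sketch, assuming the same naming pattern; the selector bodies and the simplified arguments are placeholders for illustration only.

```r
## Placeholder selectors (hypothetical bodies): return indices of z "genes".
geneSelect.Fratio   <- function(x, y, z) order(x, decreasing = TRUE)[seq_len(z)]
geneSelect.Wilcoxon <- function(x, y, z) order(x)[seq_len(z)]

cvFunct <- function(x, y, z, genefiltertype) {
    ## Look the function up by name. mode = "function" makes get() skip
    ## non-function objects that happen to carry the same name.
    internalGeneSelect <- get(paste("geneSelect", genefiltertype, sep = "."),
                              mode = "function")
    ## From here on only internalGeneSelect is used.
    internalGeneSelect(x, y, z)
}

top2 <- cvFunct(c(0.3, 2.5, 1.1), NULL, 2, "Fratio")
```

Here x is a plain vector rather than a matrix, purely to keep the sketch short.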
> > >
> > > I hope this makes things more clear. I did not include this detail
> > > because this is probably boring (I guess most of you have stopped
> > > reading by now :-).
> > >
> > >
> > >> Using the list forces you to think about what functions may be called and thinking about things before doing them is usually a good idea.  Personally I don't trust the user of my functions (usually my future self who has forgotten something that seemed obvious at the time) to not do something stupid with them.
> > >>
> > >> With list elements you can have names for the functions and access them either by the name or by a number, I find that a lot easier when I go back to edit/update than to remember which function f.1 or f.2 did what.
> > >>
> > >
> > > But I don't see how having your functions as list elements is easier
> > > (especially if the function is longer than 2 to 3 lines) than having
> > > all functions systematically named things such as:
> > >
> > > geneSelect.Fratio
> > > geneSelect.Random
> > > geneSelect.LetterA
> > > etc
> > >
> > > Of course, I could have a list with the components named "Fratio"
> > > "Random", "LetterA". But I fail to see what it adds. And it forces me
> > > to build the list, and probably rebuild it when (or not build it until)
> > > the user enters her/his own selection function. But the latter I do not
> > > need to do with the scheme above.
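
On the concern about rebuilding the list: a minimal sketch, assuming the selectors live in a named list, in which a user function is added by a single assignment. The names geneSelectors and runSelector, and all function bodies, are hypothetical.

```r
## Registry of selectors, keyed by the postfix (placeholder bodies).
geneSelectors <- list(
    Fratio = function(x, y, z) "Fratio placeholder",
    Random = function(x, y, z) "Random placeholder"
)

## Adding a user-supplied function later does not rebuild anything.
geneSelectors[["LetterA"]] <- function(x, y, z) "LetterA placeholder"

runSelector <- function(name, x, y, z) {
    f <- geneSelectors[[name]]
    ## A missing name yields NULL from a list, so check explicitly.
    if (!is.function(f)) stop("unknown gene selector: ", name)
    f(x, y, z)
}

out <- runSelector("LetterA", NULL, NULL, 1L)
```

Whether this is simpler than systematically named top-level functions is a matter of taste. The sketch only shows that the list does not have to be known ahead of time.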
> > >
> > >
> > >> With your function, what if the user runs:
> > >>
> > >> > g(5,3)
> > >>
> > >> What should it do?  (You have only shown definitions for f.1 and f.2.)  With my luck I would accidentally type that and just happen to have an f.3 function sitting around from a previous project that does something that I really don't want it to do now.  If I use the list approach then I will get a subscript out of bounds error rather than running something unintended.
> > >>
> > >>
> > >
> > > I see the general concern, but not how it applies here. If I pass
> > > argument "Fratio" then either I use geneSelect.Fratio or I get an
> > > error if "geneSelect.Fratio" does not exist. Similar to what would
> > > happen if I do
> > >
> > > g1(2, 8)
> > >
> > > when f.8 is not defined:
> > >
> > > Error in eval(expr, envir, enclos) : object "f.8" not found
> > >
> > > So even in more general cases, except for function redefinitions, etc,
> > > you are not able to call non-existent stuff.
> > >
> > >> 2nd, If I used the eval-parse approach then I would probably at some point redefine f.1 or f.2 to the output of a regression analysis or something, then go back and run the g function at a later time and wonder why I am getting an error, then once I have finally figured it out, now I need to remember what f.1 did and rewrite it again.  I am much less likely to accidentally replace an element of a list, and if the list is well named I am unlikely to replace the whole list by accident.
> > >>
> > >>
> > >
> > > Yes, that is true. Again, it does not apply to the actual case I have
> > > in mind, but of course, without the detailed info on context I just
> > > gave, you could not know that.
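
The overwrite scenario described above can be reproduced in a few lines. This is a sketch with made-up object names, assuming f.1 is accidentally reassigned in the same workspace.

```r
f.1 <- function(x) x + 1

## A copy of the function stored in a named list.
registry <- list("1" = f.1)

## Accidental overwrite of the top-level name by some unrelated result.
f.1 <- "output of some other analysis"

## eval(parse()) now returns the character string, not a function.
viaParse <- eval(parse(text = "f.1"))

## The list element is unaffected by the overwrite.
viaList <- registry[["1"]](2)
```

Calling viaParse(2) at this point would fail, while the list copy still works.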
> > >
> > >
> > >> 3rd, If I ever want to use this code somewhere else (new version of R, on the laptop, give to coworker, ...), it is a lot easier to save and load a single list than to try to think of all the functions that need to be saved.
> > >>
> > >
> > > Oh, sure. But all the functions above live in a single file (actually,
> > > a minipackage) except for the optional user function (which is read
> > > from a file).
> > >
> > >
> > >>
> > >> Personally I have never regretted trying not to underestimate my own future stupidity.
> > >>
> > >
> > > Neither do I. And actually, that is why I asked: if Thomas Lumley
> > > said, in the fortune, that I better rethink about it, then I should
> > > try rethinking about it. But I asked because I failed to see what the
> > > problem is.
> > >
> > >
> > >> Hope this helps,
> > >>
> > >
> > > It certainly does.
> > >
> > >
> > > Best,
> > >
> > > R.
> > >
> > >
> > >> --
> > >> Gregory (Greg) L. Snow Ph.D.
> > >> Statistical Data Center
> > >> Intermountain Healthcare
> > >> greg.snow at intermountainmail.org
> > >> (801) 408-8111
> > >>
> > >>
> > >>
> > >> > -----Original Message-----
> > >> > From: r-help-bounces at stat.math.ethz.ch
> > >> > [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Ramon
> > >> > Diaz-Uriarte
> > >> > Sent: Friday, January 05, 2007 11:41 AM
> > >> > To: Peter Dalgaard
> > >> > Cc: r-help; rdiaz02 at gmail.com
> > >> > Subject: Re: [R] eval(parse(text vs. get when accessing a function
> > >> >
> > >> > On Friday 05 January 2007 19:21, Peter Dalgaard wrote:
> > >> > > Ramon Diaz-Uriarte wrote:
> > >> > > > Dear All,
> > >> > > >
> > >> > > > I've read Thomas Lumley's fortune "If the answer is parse() you
> > >> > > > should usually rethink the question.". But I am not sure if that
> > >> > > > also applies (and why) to other situations (Lumley's comment
> > >> > > > http://tolstoy.newcastle.edu.au/R/help/05/02/12204.html
> > >> > > > was in reply to accessing a list).
> > >> > > >
> > >> > > > Suppose I have similarly called functions, except for a postfix. E.g.
> > >> > > >
> > >> > > > f.1 <- function(x) {x + 1}
> > >> > > > f.2 <- function(x) {x + 2}
> > >> > > >
> > >> > > > And sometimes I want to call f.1 and some other times f.2 inside
> > >> > > > another function. I can either do:
> > >> > > >
> > >> > > > g <- function(x, fpost) {
> > >> > > >     calledf <- eval(parse(text = paste("f.", fpost, sep = "")))
> > >> > > >     calledf(x)
> > >> > > >     ## do more stuff
> > >> > > > }
> > >> > > >
> > >> > > >
> > >> > > > Or:
> > >> > > >
> > >> > > > h <- function(x, fpost) {
> > >> > > >     calledf <- get(paste("f.", fpost, sep = ""))
> > >> > > >     calledf(x)
> > >> > > >     ## do more stuff
> > >> > > > }
> > >> > > >
> > >> > > >
> > >> > > > Two questions:
> > >> > > > 1) Why is the second better?
> > >> > > >
> > >> > > > 2) By changing g or h I could use "do.call" instead; why would that
> > >> > > > be better? Because I can handle differences in argument lists?
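
One practical difference between g and h shows up when fpost is not fully under the programmer's control. A minimal sketch, assuming a hostile or mistyped postfix; the variable sideEffect exists only to make the effect visible.

```r
f.1 <- function(x) x + 1
sideEffect <- FALSE

## A postfix that smuggles in extra expressions.
fpost <- "1; sideEffect <- TRUE; f.1"

## eval(parse()) evaluates every expression contained in the string.
calledf <- eval(parse(text = paste("f.", fpost, sep = "")))

## get() treats the whole string as a single object name and fails instead.
viaGet <- try(get(paste("f.", fpost, sep = "")), silent = TRUE)
```

After this runs, sideEffect has been changed by the parsed string, whereas the get() call only produced an error.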
> > >> >
> > >> > Dear Peter,
> > >> >
> > >> > Thanks for your answer.
> > >> >
> > >> > >
> > >> > > Who says that they are better?  If the question is how to call a
> > >> > > function specified by half of its name, the answer could well be to
> > >> > > use parse(), the point is that you should rethink whether that was
> > >> > > really the right question.
> > >> > >
> > >> > > Why not instead, e.g.
> > >> > >
> > >> > > f <- list("1"=function(x) {x + 1} , "2"=function(x) {x + 2})
> > >> > > h <- function(x, fpost) f[[fpost]](x)
> > >> > >
> > >> > > > h(2,"2")
> > >> > >
> > >> > > [1] 4
> > >> > >
> > >> > > > h(2,"1")
> > >> > >
> > >> > > [1] 3
> > >> > >
> > >> >
> > >> > I see, this is a direct way of dealing with the problem.
> > >> > However, you first need to build the f list, and you might
> > >> > not know about that ahead of time. For instance, if I build a
> > >> > function so that the only thing that you need to do to use my
> > >> > function g is to call your function "f.something", and then
> > >> > pass the "something".
> > >> >
> > >> > I am still under the impression that, given your answer,
> > >> > using "eval(parse(text" is not your preferred way.  What are
> > >> > the possible problems (if there are any, that is). I guess I
> > >> > am puzzled by "rethink whether that was really the right question".
> > >> >
> > >> >
> > >> > Thanks,
> > >> >
> > >> > R.
> > >> >
> > >> >
> > >> >
> > >> >
> > >> >
> > >> >
> > >> >
> > >> > > > Thanks,
> > >> > > >
> > >> > > >
> > >> > > > R.
> > >> >
> > >> > --
> > >> > Ramón Díaz-Uriarte
> > >> > Centro Nacional de Investigaciones Oncológicas (CNIO)
> > >> > (Spanish National Cancer Center) Melchor Fernández Almagro, 3
> > >> > 28029 Madrid (Spain)
> > >> > Fax: +-34-91-224-6972
> > >> > Phone: +-34-91-224-6900
> > >> >
> > >> > http://ligarto.org/rdiaz
> > >> > PGP KeyID: 0xE89B3462
> > >> > (http://ligarto.org/rdiaz/0xE89B3462.asc)
> > >> >
> > >> >
> > >> >
> > >> >
> > >> > ______________________________________________
> > >> > R-help at stat.math.ethz.ch mailing list
> > >> > https://stat.ethz.ch/mailman/listinfo/r-help
> > >> > PLEASE do read the posting guide
> > >> > http://www.R-project.org/posting-guide.html
> > >> > and provide commented, minimal, self-contained, reproducible code.
> > >> >
> > >>
> > >>
> > >
> > >
> > > --
> > > Ramon Diaz-Uriarte
> > > Statistical Computing Team
> > > Structural Biology and Biocomputing Programme
> > > Spanish National Cancer Centre (CNIO)
> > > http://ligarto.org/rdiaz
> > >
> >
> > --
> > Martin T. Morgan
> > Bioconductor / Computational Biology
> > http://bioconductor.org
> >
>
>
>