[R] eval(parse(text vs. get when accessing a function

Sat Jan 6 15:31:20 CET 2007

I guess the problems with eval(parse(...)) are:

- speed
- verbosity
- security
- length of string to be parsed

get is faster, less verbose and safer than eval(parse(...)) in this case.
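
A minimal sketch of the safety point, using the f.1/f.2 functions from the
original question quoted below.  The names by_parse, by_get and evil are
made up for illustration only.  It shows that a crafted postfix can smuggle
arbitrary code through eval(parse(...)), whereas get only looks up a name:

```r
f.1 <- function(x) x + 1
f.2 <- function(x) x + 2

## eval(parse(...)) version: whatever is in fpost gets parsed and evaluated
by_parse <- function(x, fpost) {
    calledf <- eval(parse(text = paste("f.", fpost, sep = "")))
    calledf(x)
}

## get version: the pasted string is only ever treated as an object name
by_get <- function(x, fpost) {
    calledf <- get(paste("f.", fpost, sep = ""), mode = "function")
    calledf(x)
}

by_parse(2, "2")   # 4
by_get(2, "2")     # 4

## A hostile postfix: a second expression after the ";" replaces the function
evil <- "1; (function(x) 'injected')"
by_parse(2, evil)  # "injected"
tryCatch(by_get(2, evil), error = function(e) "error: no such function")
```

With get the hostile string is just a name that does not exist, so the call
fails instead of running the injected code.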

The length-of-string problem does not seem to apply here, but I have encountered
one situation where it did.  gsubfn in package gsubfn does string replacement.
One user wanted to use it on strings that were ~ 25000 characters long, and
parse could not handle that length.  I removed parse, replacing it with other
constructs, and now such strings can be handled.
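
Applied to the cvFunct example quoted below, the get version might look
roughly like this.  It is only a sketch: the bodies of the geneSelect
functions are placeholders, the argument list of cvFunct is simplified, and
the exists() check is my own addition to give a clearer error message:

```r
## Placeholder selection functions (the real ones do actual gene selection)
geneSelect.Fratio   <- function(x, y, z) "Fratio"
geneSelect.Wilcoxon <- function(x, y, z) "Wilcoxon"

cvFunct <- function(x, y, z, genefiltertype) {
    fname <- paste("geneSelect", genefiltertype, sep = ".")
    ## fail early, with a readable message, if no such function is defined
    if (!exists(fname, mode = "function"))
        stop("unknown gene filter type: ", genefiltertype)
    internalGeneSelect <- get(fname, mode = "function")
    ## do things calling internalGeneSelect
    internalGeneSelect(x, y, z)
}

cvFunct(matrix(0, 2, 2), c(1, 2), 1L, "Fratio")   # "Fratio"
```

The naming pattern ("geneSelect." plus a postfix) is kept, so adding a new
selection function still only requires defining it under the right name.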

On 1/6/07, Ramon Diaz-Uriarte <rdiaz02 at gmail.com> wrote:
> Dear Greg,
>
>
> On 1/5/07, Greg Snow <Greg.Snow at intermountainmail.org> wrote:
> > Ramon,
> >
> > I prefer to use the list method for this type of thing. Here are a couple of reasons why (maybe you are more organized than I am and would never do some of the stupid things that I have, so these don't apply to you, but you can see that the general suggestion applies to some of the rest of us).
> >
>
>
> Those suggestions do apply to me of course (no claim to being
> organized nor beyond idiocy here). And actually the suggestions on
> this thread are being very useful. I think, though, that I was not
> very clear on the context and my examples were too dumbed down. So
> I'll try to give more detail (nothing here is secret, I am just trying
> not to bore people).
>
> The code is part of a web-based application, so there is no
> interactive user. The R code is passed the arguments (and optional
> user functions) from the CGI.
>
> There is one "core" function (call it cvFunct) that, among other
> things, does cross-validation. So this is one way to do things:
>
> cvFunct <- function(whatever, genefiltertype, whateverelse) {
>      internalGeneSelect <- eval(parse(text = paste("geneSelect",
>                                             genefiltertype, sep = ".")))
>
>      ## do things calling internalGeneSelect,
> }
>
> and now define all possible functions as
>
> geneSelect.Fratio <- function(x, y, z) { }    ## something
> geneSelect.Wilcoxon <- function(x, y, z) { }  ## something else
>
> If I want more geneSelect functions, adding them is simple. And I can
> even allow the user to pass her/his own functions, with the only
> restrictions being that the function takes three args, x, y, z, and
> that its name is "geneSelect." followed by a user-chosen string. (Yes,
> I need to make sure there are no calls to "system", etc., in the user
> code, but that is another issue.)
>
> The general idea is not new of course. For instance, in package
> "e1071", a somewhat similar thing is done in function "tune", and
> David Meyer there uses "do.call". However, tune is a lot more general
> than what I had in mind. For instance, "tune" deals with arbitrary
> functions, with arbitrary numbers and names of parameters, whereas my
> functions above all take only three arguments (x: a matrix, y: a
> vector; z: an integer), so the neat functionality provided by
> "do.call", and passing the args as a list is not really needed.
>
> So, given that my situation is so structured, and I do not need
> "do.call", I think the approach via eval(parse(paste makes my life
> simple:
>
> a) the central function (cvFunct) uses something I can easily
> recognize: "internalGeneSelect"
>
> b) after the initial eval(parse(text I do not need to worry anymore
> about what the "true" gene selection function is called
>
> c) adding new functions and calling them is simple: function naming
> follows a simple pattern ("geneSelect." + postfix) and calling the
> user function only requires passing the postfix to cvFunct.
>
> d) notice also that, at least the functs. I define, will of course not
> be named "f.1", etc, but rather things like "geneSelect.Fratio" or
> "geneSelect.namesThatStartWithCuteLetters";
>
> I hope this makes things more clear. I did not include this detail
> because this is probably boring (I guess most of you have stopped
> reading by now :-).
>
>
> > Using the list forces you to think about what functions may be called, and thinking about things before doing them is usually a good idea.  Personally I don't trust the user of my functions (usually my future self, who has forgotten something that seemed obvious at the time) not to do something stupid with them.
> >
> > With list elements you can have names for the functions and access them either by the name or by a number, I find that a lot easier when I go back to edit/update than to remember which function f.1 or f.2 did what.
> >
>
> But I don't see how having your functions as list elements is easier
> (especially if the function is longer than 2 to 3 lines) than having
> all functions systematically named things such as:
>
> geneSelect.Fratio
> geneSelect.Random
> geneSelect.LetterA
> etc
>
> Of course, I could have a list with the components named "Fratio",
> "Random", "LetterA". But I fail to see what it adds. And it forces me
> to build the list, and probably rebuild it when (or not build it until)
> the user enters her/his own selection function. The latter I do not
> need to do with the scheme above.
>
>
> > With your function, what if the user runs:
> >
> > > g(5,3)
> >
> > What should it do?  (You have only shown definitions for f.1 and f.2.)  With my luck I would accidentally type that and just happen to have an f.3 function sitting around from a previous project that does something that I really don't want it to do now.  If I use the list approach then I will get a subscript out of bounds error rather than running something unintended.
> >
> >
>
> I see the general concern, but not how it applies here. If I pass
> argument "Fratio" then either I use geneSelect.Fratio or I get an
> error if "geneSelect.Fratio" does not exist. Similar to what would
> happen if I do
>
> g1(2, 8)
>
> when f.8 is not defined:
>
> Error in eval(expr, envir, enclos) : object "f.8" not found
>
> So even in more general cases, except for function redefinitions, etc.,
> you are not able to call non-existent stuff.
>
> > 2nd, If I used the eval-parse approach then I would probably at some point redefine f.1 or f.2 to the output of a regression analysis or something, then go back and run the g function at a later time and wonder why I am getting an error, then once I have finally figured it out, now I need to remember what f.1 did and rewrite it again.  I am much less likely to accidentally replace an element of a list, and if the list is well named I am unlikely to replace the whole list by accident.
> >
> >
>
> Yes, that is true. Again, it does not apply to the actual case I have
> in mind, but of course, without the detailed info on context I just
> gave, you could not know that.
>
>
> > 3rd, If I ever want to use this code somewhere else (new version of R, on the laptop, give to coworker, ...), it is a lot easier to save and load a single list than to try to think of all the functions that need to be saved.
> >
>
> Oh, sure. But all the functions above live in a single file (actually,
> a minipackage) except for the optional user function (which is read
> from a file).
>
>
> >
> > Personally I have never regretted trying not to underestimate my own future stupidity.
> >
>
> Neither have I. And actually, that is why I asked: if Thomas Lumley
> said, in the fortune, that I had better rethink the question, then I
> should try rethinking it. But I asked because I failed to see what the
> problem is.
>
>
> > Hope this helps,
> >
>
> It certainly does.
>
>
> Best,
>
> R.
>
>
> > --
> > Gregory (Greg) L. Snow Ph.D.
> > Statistical Data Center
> > Intermountain Healthcare
> > greg.snow at intermountainmail.org
> > (801) 408-8111
> >
> >
> >
> > > -----Original Message-----
> > > From: r-help-bounces at stat.math.ethz.ch
> > > [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Ramon
> > > Diaz-Uriarte
> > > Sent: Friday, January 05, 2007 11:41 AM
> > > To: Peter Dalgaard
> > > Cc: r-help; rdiaz02 at gmail.com
> > > Subject: Re: [R] eval(parse(text vs. get when accessing a function
> > >
> > > On Friday 05 January 2007 19:21, Peter Dalgaard wrote:
> > > > Ramon Diaz-Uriarte wrote:
> > > > > Dear All,
> > > > >
> > > > > I've read Thomas Lumley's fortune "If the answer is parse() you
> > > > > should usually rethink the question.". But I am not sure if that
> > > > > also applies (and why) to other situations (Lumley's comment
> > > > > http://tolstoy.newcastle.edu.au/R/help/05/02/12204.html
> > > > > was in reply to accessing a list).
> > > > >
> > > > > Suppose I have similarly named functions, except for a postfix. E.g.
> > > > >
> > > > > f.1 <- function(x) {x + 1}
> > > > > f.2 <- function(x) {x + 2}
> > > > >
> > > > > And sometimes I want to call f.1 and some other times f.2 inside
> > > > > another function. I can either do:
> > > > >
> > > > > g <- function(x, fpost) {
> > > > >     calledf <- eval(parse(text = paste("f.", fpost, sep = "")))
> > > > >     calledf(x)
> > > > >     ## do more stuff
> > > > > }
> > > > >
> > > > >
> > > > > Or:
> > > > >
> > > > > h <- function(x, fpost) {
> > > > >     calledf <- get(paste("f.", fpost, sep = ""))
> > > > >     calledf(x)
> > > > >     ## do more stuff
> > > > > }
> > > > >
> > > > >
> > > > > Two questions:
> > > > > 1) Why is the second better?
> > > > >
> > > > > 2) By changing g or h I could use "do.call" instead; why would that
> > > > > be better? Because I can handle differences in argument lists?
> > >
> > > Dear Peter,
> > >
> > > Thanks for your answer.
> > >
> > > >
> > > > Who says that they are better?  If the question is how to call a
> > > > function specified by half of its name, the answer could well be to
> > > > use parse(), the point is that you should rethink whether that was
> > > > really the right question.
> > > >
> > > > Why not instead, e.g.
> > > >
> > > > f <- list("1"=function(x) {x + 1} , "2"=function(x) {x + 2})
> > > > h <- function(x, fpost) f[[fpost]](x)
> > > >
> > > > > h(2,"2")
> > > >
> > > > [1] 4
> > > >
> > > > > h(2,"1")
> > > >
> > > > [1] 3
> > > >
> > >
> > > I see, this is a direct way of dealing with the problem.
> > > However, you first need to build the f list, and you might
> > > not know its contents ahead of time. For instance, I might build
> > > my function g so that the only thing you need to do to use it
> > > is to name your function "f.something" and then
> > > pass the "something".
> > >
> > > I am still under the impression that, given your answer,
> > > using "eval(parse(text" is not your preferred way.  What are
> > > the possible problems (if there are any, that is)? I guess I
> > > am puzzled by "rethink whether that was really the right question".
> > >
> > >
> > > Thanks,
> > >
> > > R.
> > >
> > > > > Thanks,
> > > > >
> > > > >
> > > > > R.
> > >
> > > --
> > > Ramón Díaz-Uriarte
> > > Centro Nacional de Investigaciones Oncológicas (CNIO)
> > > (Spanish National Cancer Center) Melchor Fernández Almagro, 3
> > > 28029 Madrid (Spain)
> > > Fax: +-34-91-224-6972
> > > Phone: +-34-91-224-6900
> > >
> > > http://ligarto.org/rdiaz
> > > PGP KeyID: 0xE89B3462
> > > (http://ligarto.org/rdiaz/0xE89B3462.asc)
> > >
> > >
> > >
> > > ______________________________________________
> > > R-help at stat.math.ethz.ch mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide
> > > http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.
> > >
> >
> >
>
>
> --
> Ramon Diaz-Uriarte
> Statistical Computing Team
> Structural Biology and Biocomputing Programme
> Spanish National Cancer Centre (CNIO)
> http://ligarto.org/rdiaz
>