[Rd] evaluation in transform versus within

Gabriel Becker gmbecker at ucdavis.edu
Wed Apr 1 20:05:59 CEST 2015


Ah, of course. Embarrassing, the "environment" and "new.env" wires got
crossed in my head somehow.

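Joris - The takeaway, as Duncan's point suggests, is that e (which is
where expr is evaluated) is the "environment form" of data: a single
environment holding the columns of data, whose parent is the frame that
within() was called from. That's why the lookup hits things in data
before anything else.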
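
For what it's worth, here is a minimal sketch that checks this at the
console. make_env is a made-up helper that just copies the first two lines
of within.data.frame as Duncan quoted them; df, x and z are illustrative
values, not anything from base R.

```r
# Hypothetical stand-in for the first two lines of within.data.frame();
# the name make_env is invented for this illustration.
make_env <- function(data) {
    parent <- parent.frame()
    evalq(environment(), data, parent)
}

df <- data.frame(x = 1:10, y = 11:20)
x <- "I'm a character"

e <- make_env(df)

# e holds the columns of df ...
sort(ls(e))
# [1] "x" "y"

# ... and its parent is the frame make_env() was called from, so there is
# one environment built from data, not a new one nested inside another.
identical(parent.env(e), environment())
# [1] TRUE

# Lookup in e finds the column x before the character x defined above.
eval(quote(mean(x)), e)
# [1] 5.5

# Symbols that are not columns fall through to the calling frame.
z <- 100
eval(quote(z + x[1]), e)
# [1] 101
```

So the columns shadow anything of the same name in the caller, and
everything else is resolved in the caller as usual.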

Sorry for the unintentional obfuscation.

~G


On Wed, Apr 1, 2015 at 10:55 AM, Duncan Murdoch <murdoch.duncan at gmail.com>
wrote:

> On 01/04/2015 1:35 PM, Gabriel Becker wrote:
>
>> Joris,
>>
>>
>> The second argument to evalq is envir, so that line says, roughly, "call
>> environment() to generate me a new environment within the environment
>> defined by data".
>>
>
> I think that's not quite right.  environment() returns the current
> environment; it doesn't create a new one.  It is evalq() that created a new
> environment from data, and environment() just returns it.
>
> Here's what happens.  I've put the code first, the description of what
> happens on the line below.
>
>     parent <- parent.frame()
>
> Get the environment from which within.data.frame was called.
>
>     e <- evalq(environment(), data, parent)
>
> Create a new environment containing the columns of data, with the parent
> being the environment where we were called.
> Return it and store it in e.
>
>     eval(substitute(expr), e)
>
> Evaluate the expression in this new environment.
>
>     l <- as.list(e)
>
> Convert it to a list.
>
>     l <- l[!vapply(l, is.null, NA, USE.NAMES = FALSE)]
>
> Delete NULL entries from the list.
>
>     nD <- length(del <- setdiff(names(data), (nl <- names(l))))
>
> Find out if any columns were deleted.
>
>     data[nl] <- l
>
> Set the columns of data to the values from the list.
>
>     if (nD)
>         data[del] <- if (nD == 1)
>             NULL
>         else vector("list", nD)
>     data
>
> Delete the columns from data which were deleted from the list.
>
>
>
>> Note that that line only generates e, the environment that expr will be
>> evaluated in on the next line (the call to eval). This means that expr
>> is evaluated in an environment which is inside the environment defined by
>> data, so you get non-standard evaluation in that symbols defined in data
>> will be available to expr earlier in symbol lookup than those in the
>> environment that within() was called from.
>>
>
> This again sounds like there are two environments created, when really
> there's just one, but the last part is correct.
>
> Duncan Murdoch
>
>
>
>> This is easy to confirm from the behavior of these functions:
>>
>> > df = data.frame(x = 1:10, y = rnorm(10))
>> > x = "I'm a character"
>> > mean(x)
>> [1] NA
>> Warning message:
>> In mean.default(x) : argument is not numeric or logical: returning NA
>> > within(df, mean.x <- mean(x))
>>      x            y mean.x
>> 1   1  0.396758869    5.5
>> 2   2  0.945679050    5.5
>> 3   3  1.980039723    5.5
>> 4   4 -0.187059706    5.5
>> 5   5  0.008220067    5.5
>> 6   6  0.451175885    5.5
>> 7   7 -0.262064017    5.5
>> 8   8 -0.652301191    5.5
>> 9   9  0.673609455    5.5
>> 10 10 -0.075590905    5.5
>> > with(df, mean(x))
>> [1] 5.5
>>
>> P.S. this is probably an r-help question.
>>
>> Best,
>> ~G
>>
>>
>>
>>
>> On Wed, Apr 1, 2015 at 10:21 AM, Joris Meys <jorismeys at gmail.com> wrote:
>>
>> > Dear list members,
>> >
>> > I'm a bit confused about the evaluation of expressions using with() or
>> > within() versus subset() and transform(). I always teach my students to
>> > use with() and within() because of the warning mentioned in the help
>> > pages of subset() and transform(). Both functions use nonstandard
>> > evaluation and are to be used only interactively.
>> >
>> > I've never seen that warning on the help page of with() and within(), so
>> > I assumed both functions can safely be used in functions and packages.
>> > I've now been told that both functions pose the same risk as subset() and
>> > transform().
>> >
>> > Looking at the source code I've noticed the extra step:
>> >
>> > e <- evalq(environment(), data, parent)
>> >
>> > which, at least according to my understanding, should ensure that the
>> > functions follow the standard evaluation rules. Could somebody with more
>> > knowledge than I have shed a bit of light on this issue?
>> >
>> > Thank you
>> > Joris
>> >
>> > --
>> > Joris Meys
>> > Statistical consultant
>> >
>> > Ghent University
>> > Faculty of Bioscience Engineering
>> > Department of Mathematical Modelling, Statistics and Bio-Informatics
>> >
>> > tel :  +32 (0)9 264 61 79
>> > Joris.Meys at Ugent.be
>>
>>
>>
>>
>


-- 
Gabriel Becker, PhD
Computational Biologist
Bioinformatics and Computational Biology
Genentech, Inc.
