[R] Write a function that allows access to columns of a passed dataframe.

Rui Barradas ruipbarradas at sapo.pt
Tue Dec 6 09:17:04 CET 2016


Hello,

Just to say that I wouldn't write the function as John did. I would get 
rid of all the deparse/substitute stuff and instinctively use a quoted 
argument as a column name. Something like the following.

myfun <- function(frame, var){
	[...]
	col <- frame[, var]  # or frame[[var]]
	[...]
}

myfun(mydf, "age")  # much better, simpler, no promises.
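
For completeness, here is a minimal runnable sketch of that idea using John's example data. The body is only my assumption about what the [...] parts might do (check the arguments, then return the column); it is not John's actual code.

```r
# Example data from John's post.
mydf <- data.frame(id  = c(1, 2, 3, 4, 5),
                   sex = c("M", "M", "M", "F", "F"),
                   age = c(20, 34, 43, 32, 21))

# 'var' is a character string naming one column of 'frame'.
myfun <- function(frame, var) {
  stopifnot(is.data.frame(frame), is.character(var), length(var) == 1)
  if (!var %in% names(frame))
    stop("no column named '", var, "' in the data frame")
  frame[[var]]  # returns the column as a vector; frame[, var] also works here
}

col <- myfun(mydf, "age")
print(col)
```

Because the column name arrives as an ordinary character value, nothing inside the function depends on how the caller happened to spell the argument.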

Rui Barradas

Em 05-12-2016 21:49, Bert Gunter escreveu:
> Typo: "lazy evaluation" not "lay evaluation."
>
> -- Bert
>
>
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along
> and sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
>
> On Mon, Dec 5, 2016 at 1:46 PM, Bert Gunter <bgunter.4567 at gmail.com> wrote:
>> Sorry, hit "Send" by mistake.
>>
>> Inline.
>>
>>
>>
>> On Mon, Dec 5, 2016 at 1:34 PM, Bert Gunter <bgunter.4567 at gmail.com> wrote:
>>> Inline.
>>>
>>> -- Bert
>>>
>>>
>>> On Mon, Dec 5, 2016 at 9:53 AM, Rui Barradas <ruipbarradas at sapo.pt> wrote:
>>>> Hello,
>>>>
>>>> Inline.
>>>>
>>>> Em 05-12-2016 17:09, David Winsemius escreveu:
>>>>>
>>>>>
>>>>>> On Dec 5, 2016, at 7:29 AM, John Sorkin <jsorkin at grecc.umaryland.edu>
>>>>>> wrote:
>>>>>>
>>>>>> Rui,
>>>>>> I appreciate your suggestion, but eliminating the deparse statement does
>>>>>> not solve my problem. Do you have any other suggestions? See code below.
>>>>>> Thank you,
>>>>>> John
>>>>>>
>>>>>>
>>>>>> mydf <-
>>>>>> data.frame(id=c(1,2,3,4,5),sex=c("M","M","M","F","F"),age=c(20,34,43,32,21))
>>>>>> mydf
>>>>>> class(mydf)
>>>>>>
>>>>>>
>>>>>> myfun <- function(frame,var){
>>>>>>    call <- match.call()
>>>>>>    print(call)
>>>>>>
>>>>>>
>>>>>>    indx <- match(c("frame","var"),names(call),nomatch=0)
>>>>>>    print(indx)
>>>>>>    if(indx[1]==0) stop("Function called without sufficient arguments!")
>>>>>>
>>>>>>
>>>>>>    cat("I can get the name of the dataframe as a text string!\n")
>>>>>>    #xx <- deparse(substitute(frame))
>>>>>>    print(xx)
>>>>>>
>>>>>>
>>>>>>    cat("I can get the name of the column as a text string!\n")
>>>>>>    #yy <- deparse(substitute(var))
>>>>>>    print(yy)
>>>>>>
>>>>>>
>>>>>>    # This does not work.
>>>>>>    print(frame[,var])
>>>>>>
>>>>>>
>>>>>>    # This does not work.
>>>>>>    print(frame[,"var"])
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>    # This does not work.
>>>>>>    col <- xx[,"yy"]
>>>>>>
>>>>>>
>>>>>>    # Nor does this work.
>>>>>>    col <- xx[,yy]
>>>>>>    print(col)
>>>>>> }
>>>>>>
>>>>>>
>>>>>> myfun(mydf,age)
>>>>>
>>>>>
>>>>>
>>>>> When you use that calling syntax, the system will supply the values of
>>>>> whatever the `age` variable contains. (And if there is no `age`-named
>>>>> object, you get an error at the time of the call to `myfun`.)
>>>>
>>>>
>>>> Actually, no. This was very surprising to me, but John's call worked (the
>>>> call itself, not the function body). And with the change I've proposed, it
>>>> ran flawlessly, with no errors. Why, I don't know.
>>
>> See ?substitute and in particular the example highlighted there.
>>
>> The technical details are explained in the R Language Definition
>> manual. The key here is the use of promises for lazy evaluation. In
>> fact, the expression in the call *is* available within the functions,
>> as is (a pointer to) the environment in which to evaluate the
>> expression. That is how substitute() works. Specifically, quoting from
>> the manual,
>>
>> *****
>> It is possible to access the actual (not default) expressions used as
>> arguments inside the function. The mechanism is implemented via
>> promises. When a function is being evaluated the actual expression
>> used as an argument is stored in the promise together with a pointer
>> to the environment the function was called from. When (if) the
>> argument is evaluated the stored expression is evaluated in the
>> environment that the function was called from. Since only a pointer to
>> the environment is used any changes made to that environment will be
>> in effect during this evaluation. The resulting value is then also
>> stored in a separate spot in the promise. Subsequent evaluations
>> retrieve this stored value (a second evaluation is not carried out).
>> Access to the unevaluated expression is also available using
>> substitute.
>> ********
>>
>> -- Bert
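
A small sketch of what the manual passage describes. The function names f, g and h and the data frame d are made up for illustration, and the code assumes that no object called nosuchobject exists in the workspace.

```r
d <- data.frame(age = c(20, 34, 43))

# f never touches 'var', so the promise is never forced and the call
# succeeds even though 'nosuchobject' does not exist.
f <- function(frame, var) nrow(frame)

# g only reads the unevaluated expression stored in the promise.
g <- function(frame, var) deparse(substitute(var))

# h forces the promise when it indexes the data frame, so the lookup of
# 'nosuchobject' fails at that point, inside h, not at the call.
h <- function(frame, var) frame[, var]

r_f <- f(d, nosuchobject)
r_g <- g(d, nosuchobject)
r_h <- try(h(d, nosuchobject), silent = TRUE)

print(r_f)
print(r_g)
print(class(r_h))
```

This is why myfun(mydf, age) does not fail at the call: the error only appears once the function body actually evaluates var.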
>>
>>
>>
>>
>>>>
>>>> Rui Barradas
>>>>
>>>>> You need either to call it as:
>>>>>
>>>>>
>>>>> myfun( mydf , "age")
>>>>>
>>>>>
>>>>> # Or:
>>>>>
>>>>> age <- "age"
>>>>> myfun( mydf, age)
>>>>>
>>>>> Unless your value of the `age`-named variable was "age" in the calling
>>>>> environment (and you did not give us that value in either of your postings),
>>>>> you would fail.
>>>>>
>>>>


