[Rd] Pipe operator status, placeholders?

Benjamin Redelings benjamin.redelings sending from gmail.com
Wed Apr 20 05:23:42 CEST 2022

On 4/19/22 8:22 PM, Duncan Murdoch wrote:
> On 19/04/2022 6:55 p.m., Simon Urbanek wrote:
>> Ben,
>> I think you considered only part of Peter's response. Placeholders 
>> can safely only work for the first call, hence at the top level. 
>> Anything below may not do what you think as you'd have to skip frames 
>> and suddenly things can have entirely different meaning since you're 
>> not evaluating in the scope of the preceding call. That is also the 
>> reason why only named arguments are allowed, because if it was not 
>> the case then you might be tempted to write x |> _$foo[1] which looks 
>> legit at a first glance, but is no longer at the top level (since it 
>> is `[`(`$`(_, foo), 1)) and thus not valid.
> The R pipe is purely syntactic sugar, it just transforms expressions.  
> I think the real reason not to allow _ to be deeply nested in an 
> expression is that it would make parsing really hard.  If you have
>  x |> { some really huge expression }
> then the parser would have to parse the huge expression and search it 
> for underscores to see what to do with x.  With the current rule, the 
> search is much easier, it's just at the top level.
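
A small sketch of the "purely syntactic sugar" point, assuming R >= 4.2 (4.1 for |> itself, 4.2 for the _ placeholder). Quoting a piped expression shows that the parser has already rewritten it into an ordinary call; f, x, y and z are just illustrative names and are never evaluated:

```r
# The pipe is gone by the time the expression is parsed:
# quote() returns the already-transformed call.
e1 <- quote(x |> f(y))
e2 <- quote(x |> f(y, z = _))

print(e1)  # f(x, y)
print(e2)  # f(y, z = x)
```

Since only unevaluated calls are built here, none of the names need to exist.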

As long as the search for placeholders is linear in the size of the 
total expression, I suspect that this would be quite fast.  It 
might be a bit tricky to make sure that each sub-expression is searched 
only once, if there are nested pipes.
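
To make "linear in the size of the expression" concrete, here is a rough sketch at the R level rather than in the parser. It assumes a stand-in symbol PH for the placeholder (a bare _ cannot be quoted outside a pipe), and count_placeholders is a hypothetical helper, not anything in R itself. all.names() visits every node of the expression once:

```r
# Hypothetical helper: count occurrences of a stand-in placeholder
# symbol at any depth. all.names() does a single walk over the
# expression tree, so the cost is linear in its size.
count_placeholders <- function(expr, placeholder = "PH") {
  sum(all.names(expr) == placeholder)
}

rhs <- quote(f(PH, g(y, h(PH, z))))
n <- count_placeholders(rhs)
print(n)
```

This says nothing about how the real parser would do it; it only shows that a full-depth search is a single tree walk.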

> There are probably cases where deeply nested underscores would be 
> ambiguous, e.g. if that huge expression contained a pipe operator 
> itself, who gets the substitution?

If you have something like x |> f(_, y |> g(_)), then you could make a 
rule, such as: when the top-level pipe searches for a placeholder in its 
RHS, it ignores the RHS of any nested pipe operator that it finds.

Interestingly, this happens automatically.  When the top-level pipe 
operator searches its RHS for the placeholder, it would see `f(_ , 
g(y))`.  Any nested pipe in the RHS would already have consumed its own 
placeholder and been transformed into a pipeless expression.

However, the top-level pipe would search the expression `g(y)` for a 
placeholder, which means that the expression gets searched twice.
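
A sketch of the nested case that the current parser does accept, assuming R >= 4.2. Named arguments are used because the placeholder is only allowed as a named argument; f, g, a, b and z are illustrative names. The inner pipe is rewritten first, so the outer pipe only ever sees a pipeless RHS:

```r
# Inner pipe: y |> g(z = _)  becomes  g(z = y)
# Outer pipe then substitutes x into its own top-level placeholder.
e <- quote(x |> f(a = _, b = y |> g(z = _)))
print(e)
```

Each placeholder is consumed by the nearest enclosing pipe, so there is no ambiguity in this form.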

> The other limitation of the transformation approach is that _ can only 
> occur once.  magrittr evaluates x and puts the value in where it sees 
> a dot, so this works to print 2 once and give a value of 4:
>   print(2) %>% `+`(., .)
> It's equivalent to
>   *tmp* <- print(2)
>   *tmp* + *tmp*
> However, you'd have the print executed twice in
>   print(2) |> `+`(_, _)
> (if such was allowed), because it would be equivalent to
>   print(2) + print(2)

Yeah, this makes sense.  If you allow the placeholder to occur twice, 
you can't just substitute the expression, because then it could be 
evaluated twice.  Instead the LHS has to be bound to a name and 
evaluated at most once, i.e. lazy evaluation, which the lambda 
function syntax `x |> (\(d) ...)()` already provides.
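
A minimal sketch of the difference, assuming R >= 4.1. noisy() is a hypothetical stand-in for print(2) that counts how often it is evaluated:

```r
calls <- 0L
noisy <- function(v) {
  calls <<- calls + 1L  # record each evaluation
  v
}

# What plain substitution into two slots would amount to:
r_subst <- noisy(2) + noisy(2)
calls_subst <- calls

# The lambda form binds the LHS to d, which is evaluated once:
calls <- 0L
r_lambda <- noisy(2) |> (\(d) d + d)()
calls_lambda <- calls

print(c(substitution = calls_subst, lambda = calls_lambda))
```

Both forms return the same value; they differ only in how many times the LHS is evaluated.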

take care,


> Duncan Murdoch
>> Cheers,
>> Simon
>>> On Apr 20, 2022, at 12:43 AM, Benjamin Redelings 
>>> <benjamin.redelings using gmail.com> wrote:
>>> Thanks to you and Lionel for the info!  I wasn't sure if there was a 
>>> private core developers list, or if I was just looking in the wrong 
>>> place.
>>> 1. It's good to know that the only reason not to allow _ in 
>>> positional arguments is that it's easy to miss.  Personally, I would 
>>> like to be able to write f(x, _), but it's not a big deal.
>>> Is the idea that when you see
>>>      x |> f(x, y, _, z, w)
>>> it's hard for the eye to scan the RHS and find the _?
>>> Hmm.... I notice that a lot of languages (e.g. Haskell) use _ as a 
>>> wildcard pattern, and I don't recall any complaints about it being 
>>> hard to see.
>>> 2. I can see how there would be issues with placeholders that aren't 
>>> at the top level... although I'm not sure precisely what they are.  
>>> Any hints? :-)  I did briefly look at the parser/grammar file...
>>> Thanks again for the info!
>>> -BenRI
>>> On 4/19/22 3:24 AM, peter dalgaard wrote:
>>>> You probably want Luke Tierney for the full story, but what I 
>>>> gather from the deliberations (on the private R-core list), there 
>>>> are issues with how non-funcall syntax like lm(....) |> _$coef[2] 
>>>> should work. This, in turn, has to do with wanting to have the 
>>>> placeholder occur only as a toplevel substitution (i.e. "["("$"(_, 
>>>> coef), 2) is a no-go). And the reason for that has to do with the 
>>>> way the pipe works in the absence of a placeholder, e.g. the parser 
>>>> gets confused by
>>>>> x |> f(g(x=_))
>>>> Error in f(x, g(x = "_")) : invalid use of pipe placeholder
>>>> -pd
>>>>> On 17 Apr 2022, at 01:04 , Benjamin Redelings 
>>>>> <benjamin.redelings using gmail.com> wrote:
>>>>> Hi,
>>>>> I see that R 4.2 adds the underscore _ as a placeholder for the 
>>>>> new forward pipe operator |> , but only for named arguments. The 
>>>>> reason why placeholders for positional arguments were NOT added isn't 
>>>>> clear to me, so I've been looking for the discussion around the 
>>>>> introduction of the placeholder.
>>>>> By searching subject lines in the r-devel mailing list archive, 
>>>>> I've found
>>>>> https://stat.ethz.ch/pipermail/r-devel/2021-April/080646.html
>>>>> https://stat.ethz.ch/pipermail/r-devel/2021-January/080396.html
>>>>> https://stat.ethz.ch/pipermail/r-devel/2020-December/080173.html 
>>>>> and following messages
>>>>> but not much else.
>>>>> 1. Am I looking in the wrong place?
>>>>> 2. What is the reasoning behind allowing _ as a placeholder only 
>>>>> for named arguments?
>>>>> take care,
>>>>> -BenRI
>>>>> ______________________________________________
>>>>> R-devel using r-project.org mailing list
>>>>> https://stat.ethz.ch/mailman/listinfo/r-devel