[Rd] New pipe operator

Bravington, Mark (Data61, Hobart)
Mon Dec 7 02:22:32 CET 2020


Seems like this *could* be a good thing, and thanks to R core for considering it. But, FWIW:

 - I agree with Gabor G that consistency of "syntax" should be paramount here. Enough problems have been caused by earlier superficially-convenient non-standard features in R.  In particular:

 -- there should not be any discrepancy between an in-place function-definition, and a predefined function attached to a symbol (as per Gabor's point). 
 
 -- Hence, the ability to say x |> foo, i.e. without parentheses, seems bound to lead to inconsistency, because x |> foo is allowed, x |> base::foo isn't allowed without tricks, but x |> function( y) foo( y) isn't... So, x |> foo is not worth keeping. Parentheses are a price well worth paying.
 
 -- it is still inconsistent and confusing to (apparently) invoke a function in some places--- normally--- via 'foo(x)', yet in others--- pipily--- via 'foo()'. Especially if 'foo' already has a default value for its first argument.

 - I don't see the problem with a placeholder--- doesn't it remove all ambiguity? Sure there needs to be a standard unclashable name and people can argue about what that should be, but the following seems clear and flexible... to me, anyway:
 
 thing |> 
   foo( _PIPE_) |>           # standard
   bah( arg1, _PIPE_) |>   # multi-arg function
   _ANON_({ x <- sum( _PIPE_); _PIPE_/x + x/_PIPE_ })   # anon function
  
where '_PIPE_' is the ordained name of the placeholder, and '_ANON_' constructs-and-calls a function with single argument '_PIPE_'. There is just one rule (I think...): each pipe-stage must be a *call* involving the argument '_PIPE_'.
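
A rough sketch of how such a placeholder pipe might look in standard R today, using the operator name '%|>%' mentioned further down. Everything in it is an assumption for illustration: the check, the error message, and the need to backquote `_PIPE_` (names starting with an underscore are not syntactic in R). It is not the proposal itself, just one way the single rule could be enforced.

```r
# Hypothetical user-level placeholder pipe; not part of base R.
`%|>%` <- function(lhs, rhs) {
  # Capture the right-hand stage without evaluating it.
  rhs_expr <- substitute(rhs)

  # The one rule: each stage must be a call that mentions `_PIPE_`.
  if (!is.call(rhs_expr) || !("_PIPE_" %in% all.names(rhs_expr)))
    stop("each pipe stage must be a call involving `_PIPE_`")

  # Bind the left-hand value to `_PIPE_` in a child of the caller's
  # environment, then evaluate the stage there.
  env <- new.env(parent = parent.frame())
  assign("_PIPE_", lhs, envir = env)
  eval(rhs_expr, env)
}

# Standard stage, then a multi-argument stage.
result <- c(1, 4, 9) %|>% sqrt(`_PIPE_`) %|>% sum(`_PIPE_`, 10)
print(result)  # 16
```

Because the placeholder is an ordinary variable here, it can appear several times in one stage and the left-hand side is still evaluated only once.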


 - The proposed anonymous-function syntax looks quite ugly to me, diminishing readability and inviting errors. The new pipe symbol |> already looks scarily like quantum mechanics; adding \( just puts fishbones into the symbolic soup.

 - IMO it's not worth going too far to try to lure magrittr-etc fans to swap to the new; my experience is that many people keep using older inferior R syntax for years after better replacements become available (even if they are aware of replacements), for various reasons. Just provide a good framework, and let nature take its course.
 
 - Disclaimer: personally I'm not much of a pipehead anyway, so maybe I'm not the audience. But if I was to consider piping, I wouldn't be very tempted by the current proposal. OTOH, I might even be tempted to write--- and use!--- my own version of '%|>%' as above (maybe someone already has). And if R did it for me, that'd be great :)
 
[*] Definition of _ANON_ could be something like this--- a sketch only, to point out that it could be done in standard R.

`_ANON_` <- function( expr) { 
  #1. Construct a function with arg '_PIPE_' and body 'expr'
  #2. Construct a call() to that function
  #3. Do the call

  f <- function( `_PIPE_`) NULL
  body( f) <- substitute( expr) # capture 'expr' unevaluated; plain 'expr' would evaluate it right here
  environment( f) <- parent.frame() # so other variables in 'expr' resolve in the caller
  expr2 <- substitute( f( `_PIPE_`)) # 'f' is inlined; `_PIPE_` stays a symbol
  eval.parent( expr2) # run in the caller, where `_PIPE_` must already be bound
}

cheers
Mark

Mark Bravington
CSIRO Marine Lab
Hobart
Australia


________________________________________
From: R-devel <r-devel-bounces using r-project.org> on behalf of Gabor Grothendieck <ggrothendieck using gmail.com>
Sent: Monday, 7 December 2020 10:21
To: Gabriel Becker
Cc: r-devel using r-project.org
Subject: Re: [Rd] New pipe operator

I understand very well that it is implemented at the syntax level;
however, in any case the implementation is irrelevant to the principles.

Here is a similar example to the one I gave before, but this time written out:

This works:

  3 |> function(x) x + 1

but this does not:

  foo <- function(x) x + 1
  3 |> foo

so it breaks the principle of functions being first class objects.  foo and its
definition are not interchangeable.  You have
to write 3 |> foo() but don't have to write 3 |> (function(x) x + 1)().

This isn't just a matter of notation, i.e. foo vs foo(), but is a
matter of breaking
the way R works as a functional language with first class functions.
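
For anyone wanting to check the parse-time behaviour directly, here is a small sketch. It assumes a released R with the native pipe (4.1 or later), where the rules are stricter than in the experimental build discussed in this thread: the RHS must be a call, so an anonymous function has to be wrapped in parentheses and called.

```r
foo <- function(x) x + 1

# The pipe is rewritten by the parser; quote() shows the result
# without evaluating anything.
piped <- quote(3 |> foo())
print(piped)                        # foo(3)

# A bare symbol on the RHS is rejected while parsing, before 'foo'
# is ever looked up.
bare <- try(parse(text = "3 |> foo"), silent = TRUE)
print(inherits(bare, "try-error"))  # TRUE

# The written-out definition works once it is parenthesised and called.
print(3 |> (function(x) x + 1)())   # 4
```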

On Sun, Dec 6, 2020 at 4:06 PM Gabriel Becker <gabembecker using gmail.com> wrote:
>
> Hi Gabor,
>
> On Sun, Dec 6, 2020 at 12:52 PM Gabor Grothendieck <ggrothendieck using gmail.com> wrote:
>>
>> I think the real issue here is that functions are supposed to be
>> first class objects in R, and |> would break that if it is possible
>> to write function(x) x + 1 on the RHS but not foo (assuming foo
>> was defined as that function).
>>
>> I don't think getting experience with using it can change that
>> inconsistency which seems serious to me and needs to
>> be addressed even if it complicates the implementation
>> since it drives to the heart of what R is.
>>
>
> With respect I think this is a misunderstanding of what is happening here.
>
> Functions are first class citizens. |> is, for all intents and purposes, a macro.
>
> LHS |> RHS(arg2=5)
>
> parses to
>
> RHS(LHS, arg2 = 5)
>
> There are no functions at the point in time when the pipe transformation happens, because no code has been evaluated. To know if a symbol is going to evaluate to a function requires evaluation which is a step entirely after the one where the |> pipe is implemented.
>
> Another way to think about it is that
>
> LHS |> RHS(arg2 = 5)
>
> is another way of writing RHS(LHS, arg2 = 5), NOT R code that is (or even can be) evaluated.
>
>
> Now this is a subtle point that only really has implications inasmuch as it is not the case for magrittr pipes, but it's relevant for discussions like this, I think.
>
> ~G
>
>> On Sat, Dec 5, 2020 at 1:08 PM Gabor Grothendieck
>> <ggrothendieck using gmail.com> wrote:
>> >
>> > The construct utils::head  is not that common but bare functions are
>> > very common and to make it harder to use the common case so that
>> > the uncommon case is slightly easier is not desirable.
>> >
>> > Also it is trivial to write this which does work:
>> >
>> > mtcars %>% (utils::head)
>> >
>> > On Sat, Dec 5, 2020 at 11:59 AM Hugh Parsonage <hugh.parsonage using gmail.com> wrote:
>> > >
>> > > I'm surprised by the aversion to
>> > >
>> > > mtcars |> nrow
>> > >
>> > > over
>> > >
>> > > mtcars |> nrow()
>> > >
>> > > and I think the decision to disallow the former should be
>> > > reconsidered.  The pipe operator is only going to be used when the rhs
>> > > is a function, so there is no ambiguity with omitting the parentheses.
>> > > If it's disallowed, it becomes inconsistent with other treatments like
>> > > sapply(mtcars, typeof) where sapply(mtcars, typeof()) would just be
>> > > noise.  I'm not sure why this decision was taken
>> > >
>> > > If the only issue is with the double (and triple) colon operator, then
>> > > ideally `mtcars |> base::head` should resolve to `base::head(mtcars)`
>> > > -- in other words, demote the precedence of |>
>> > >
>> > > Obviously (looking at the R-Syntax branch) this decision was
>> > > considered, put into place, then dropped, but I can't see why
>> > > precisely.
>> > >
>> > > Best,
>> > >
>> > >
>> > > Hugh.
>> > >
>> > >
>> > >
>> > >
>> > >
>> > >
>> > >
>> > > On Sat, 5 Dec 2020 at 04:07, Deepayan Sarkar <deepayan.sarkar using gmail.com> wrote:
>> > > >
>> > > > On Fri, Dec 4, 2020 at 7:35 PM Duncan Murdoch <murdoch.duncan using gmail.com> wrote:
>> > > > >
>> > > > > On 04/12/2020 8:13 a.m., Hiroaki Yutani wrote:
>> > > > > >>   Error: function '::' not supported in RHS call of a pipe
>> > > > > >
>> > > > > > To me, this error looks much more friendly than magrittr's error.
>> > > > > > Some users have become too used to specifying functions without (). This
>> > > > > > is OK until they use `::`, but when they need to use it, it takes
>> > > > > > hours to figure out why
>> > > > > >
>> > > > > > mtcars %>% base::head
>> > > > > > #> Error in .::base : unused argument (head)
>> > > > > >
>> > > > > > won't work but
>> > > > > >
>> > > > > > mtcars %>% head
>> > > > > >
>> > > > > > works. I think this is too harsh a lesson for ordinary R users to
>> > > > > > learn that `::` is a function. I've been wanting for magrittr to drop the
>> > > > > > support for a function name without () to avoid this confusion,
>> > > > > > so I would very much welcome the new pipe operator's behavior.
>> > > > > > Thank you all the developers who implemented this!
>> > > > >
>> > > > > I agree, it's an improvement on the corresponding magrittr error.
>> > > > >
>> > > > > I think the semantics of not evaluating the RHS, but treating the pipe
>> > > > > as purely syntactical is a good decision.
>> > > > >
>> > > > > I'm not sure I like the recommended way to pipe into a particular argument:
>> > > > >
>> > > > >    mtcars |> subset(cyl == 4) |> \(d) lm(mpg ~ disp, data = d)
>> > > > >
>> > > > > or
>> > > > >
>> > > > >    mtcars |> subset(cyl == 4) |> function(d) lm(mpg ~ disp, data = d)
>> > > > >
>> > > > > both of which are equivalent to
>> > > > >
>> > > > >    mtcars |> subset(cyl == 4) |> (function(d) lm(mpg ~ disp, data = d))()
>> > > > >
>> > > > > It's tempting to suggest it should allow something like
>> > > > >
>> > > > >    mtcars |> subset(cyl == 4) |> lm(mpg ~ disp, data = .)
>> > > >
>> > > > Which is really not that far off from
>> > > >
>> > > > mtcars |> subset(cyl == 4) |> \(.) lm(mpg ~ disp, data = .)
>> > > >
>> > > > once you get used to it.
>> > > >
>> > > > One consequence of the implementation is that it's not clear how
>> > > > multiple occurrences of the placeholder would be interpreted. With
>> > > > magrittr,
>> > > >
>> > > > sort(runif(10)) %>% ecdf(.)(.)
>> > > > ## [1] 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0
>> > > >
>> > > > This is probably what you would expect, if you expect it to work at all, and not
>> > > >
>> > > > ecdf(sort(runif(10)))(sort(runif(10)))
>> > > >
>> > > > There would be no such ambiguity with anonymous functions
>> > > >
>> > > > sort(runif(10)) |> \(.) ecdf(.)(.)
>> > > >
>> > > > -Deepayan
>> > > >
>> > > > > which would be expanded to something equivalent to the other versions:
>> > > > > but that makes it quite a bit more complicated.  (Maybe _ or \. should
>> > > > > be used instead of ., since those are not legal variable names.)
>> > > > >
>> > > > > I don't think there should be an attempt to copy magrittr's special
>> > > > > casing of how . is used in determining whether to also include the
>> > > > > previous value as first argument.
>> > > > >
>> > > > > Duncan Murdoch
>> > > > >
>> > > > >
>> > > > > >
>> > > > > > Best,
>> > > > > > Hiroaki Yutani
>> > > > > >
>> > > > > > On Fri, 4 Dec 2020 at 20:51, Duncan Murdoch <murdoch.duncan using gmail.com> wrote:
>> > > > > >>
>> > > > > >> Just saw this on the R-devel news:
>> > > > > >>
>> > > > > >>
>> > > > > >> R now provides a simple native pipe syntax ‘|>’ as well as a shorthand
>> > > > > >> notation for creating functions, e.g. ‘\(x) x + 1’ is parsed as
>> > > > > >> ‘function(x) x + 1’. The pipe implementation as a syntax transformation
>> > > > > >> was motivated by suggestions from Jim Hester and Lionel Henry. These
>> > > > > >> features are experimental and may change prior to release.
>> > > > > >>
>> > > > > >>
>> > > > > >> This is a good addition; by using "|>" instead of "%>%" there should be
>> > > > > >> a chance to get operator precedence right.  That said, the ?Syntax help
>> > > > > >> topic hasn't been updated, so I'm not sure where it fits in.
>> > > > > >>
>> > > > > >> There are some choices that take a little getting used to:
>> > > > > >>
>> > > > > >>   > mtcars |> head
>> > > > > >> Error: The pipe operator requires a function call or an anonymous
>> > > > > >> function expression as RHS
>> > > > > >>
>> > > > > >> (I need to say mtcars |> head() instead.)  This sometimes leads to error
>> > > > > >> messages that are somewhat confusing:
>> > > > > >>
>> > > > > >>   > mtcars |> magrittr::debug_pipe |> head
>> > > > > >> Error: function '::' not supported in RHS call of a pipe
>> > > > > >>
>> > > > > >> but
>> > > > > >>
>> > > > > >> mtcars |> magrittr::debug_pipe() |> head()
>> > > > > >>
>> > > > > >> works.
>> > > > > >>
>> > > > > >> Overall, I think this is a great addition, though it's going to be
>> > > > > >> disruptive for a while.
>> > > > > >>
>> > > > > >> Duncan Murdoch
>> > > > > >>
>> > > > > >> ______________________________________________
>> > > > > >> R-devel using r-project.org mailing list
>> > > > > >> https://stat.ethz.ch/mailman/listinfo/r-devel
>> >
>> >
>> >
>> > --
>> > Statistics & Software Consulting
>> > GKX Group, GKX Associates Inc.
>> > tel: 1-877-GKX-GROUP
>> > email: ggrothendieck at gmail.com


