[R] Pipe operator

@vi@e@gross m@iii@g oii gm@ii@com
Wed Jan 4 19:59:44 CET 2023


Yes, not every use of a word has the same meaning. The UNIX pipe was in many
ways a very different animal: the PIPE was a very real thing and looked like
a sort of temporary file in the file system with special properties.
Basically it was a fixed-size buffer written into by one process, which was
paused when the buffer got full and allowed to continue once a second
process, managed in a similar way, had drained it by reading from it. It
assured many things that a temporary file would not have supplied, including
uniqueness and privacy. Later they created a related animal with persistence
called a NAMED PIPE.

So the pipelines we are discussing in R do indeed run synchronously, in
whatever order they need to be run: one function finishes producing its
output, in effect an anonymous variable, and that is then handed to the next
function in the pipeline.
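
For what it is worth, this can be checked directly for the native pipe,
since the parser rewrites it into an ordinary nested call before anything
runs. A minimal sketch, assuming R 4.1 or later (needed for |>); the vector
x is made-up data for illustration only:

```r
# Requires R >= 4.1 for the native pipe |>.
x <- c(1, 4, 9)

# The parser turns the pipeline into a plain nested call.
piped  <- quote(x |> sqrt() |> sum())
nested <- quote(sum(sqrt(x)))
same_call <- identical(piped, nested)

# Evaluating either form gives the same value: sqrt() has produced its
# whole result before sum() does any work on it.
piped_value  <- x |> sqrt() |> sum()
nested_value <- sum(sqrt(x))

print(same_call)
print(piped_value == nested_value)
```

Nothing is streamed between the two functions here; the pipe only changes
how the call is written.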

If you look at a language like Python or perhaps JavaScript, there are ways
to simulate a relatively asynchronous way to run functions on whatever data
is available from other functions, using ideas like generators and
iterators. You can create a function that calls another function just to get
one item, such as the next prime number, does some work, and then calls it
again so it yields just one more value, and so on. You can build these into
chains so that lots of functions stay resident in memory and only produce
data just in time as needed, perhaps even running on multiple processors for
more parallelism.
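
The one-value-at-a-time idea can be roughly imitated in R with a closure
that keeps its state between calls. This is only a sketch and not a real
generator: make_prime_generator and next_prime are hypothetical names made
up for this example, and everything still runs synchronously in one process.

```r
# Hypothetical helper: returns a function that yields the next prime
# each time it is called, keeping its state in the enclosing environment.
make_prime_generator <- function() {
  last <- 1
  is_prime <- function(n) {
    if (n < 2) return(FALSE)
    if (n < 4) return(TRUE)
    # Trial division by every integer from 2 up to floor(sqrt(n)).
    all(n %% 2:floor(sqrt(n)) != 0)
  }
  function() {
    candidate <- last + 1
    while (!is_prime(candidate)) candidate <- candidate + 1
    last <<- candidate   # remember where we stopped for the next call
    candidate
  }
}

next_prime <- make_prime_generator()

# Pull five values, one call at a time.
first_five <- vapply(1:5, function(i) next_prime(), numeric(1))
print(first_five)
```

Each call does just enough work to produce one more value, which is the
"just in time" flavor described above, minus any parallelism.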

R could possibly add such things. It already has elements of this, since
arguments are not evaluated until needed, which can have interesting
results, and of course it is possible, as in many languages, to spawn
additional processes that are linked together to run at once. But all such
speculation is beyond the bounds of what the operators we call PIPES, such
as %>% and |>, are doing. They remain very much syntactic sugar that makes
life easier for some and annoys others.
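
As a small illustration of things not being evaluated till needed, here is a
sketch using only base R. The functions ignore_second and noisy are
hypothetical, invented for this example:

```r
# Hypothetical function: its second argument is never used in the body.
ignore_second <- function(a, b) {
  a * 2
}

evaluated <- FALSE
noisy <- function() {
  evaluated <<- TRUE   # side effect records that the body actually ran
  42
}

# noisy() is passed as an argument, but the promise for b is never
# forced, so its body does not run.
result <- ignore_second(10, noisy())

print(result)
print(evaluated)
```

The call to noisy() is written out in full, yet it never runs, because
nothing inside ignore_second asks for the value of b.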

I note that in some code I see, people hedge their bets a bit about the
missing first argument. They harmlessly keep the first argument and write it
as a period, as in:

mutate(mydata, ...) %>%
    filter( ., ...) %>%
    group_by( ., ...) %>%
    summarize( ., ...)


In the above, "..." means "fill it in" and has no other meaning here. The
point is that the first argument is a period, which is replaced by the
passed-along object, but that would have been done by default without it.
It remains a reminder that there still is a first argument, and I guess it
can be helpful in other ways too: it avoids some potential confusion when
others read your code, look up a man page, and try to work out what the
second and subsequent arguments match up to.
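
A small runnable version of the same point, as a sketch that assumes the
magrittr package is installed (it is what provides %>%); the data are made
up and base functions stand in for the dplyr verbs above:

```r
# Assumes the magrittr package is installed; it provides %>%.
library(magrittr)

values <- c(1, 4, 9)

# Explicit "." as the first argument ...
with_dot <- values %>% sqrt(.) %>% sum(.)

# ... and the same pipeline relying on the default placement.
without_dot <- values %>% sqrt() %>% sum()

# The dot is needed when the object must go somewhere other than first.
as_second <- 2 %>% rep("ab", times = .)

print(identical(with_dot, without_dot))
print(as_second)
```

So the period is optional in the first position, and only does real work
when the piped object belongs in a later argument.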


-----Original Message-----
From: R-help <r-help-bounces using r-project.org> On Behalf Of Richard O'Keefe
Sent: Wednesday, January 4, 2023 1:56 AM
To: Milan Glacier <news using milanglacier.com>
Cc: R-help Mailing List <r-help using r-project.org>
Subject: Re: [R] Pipe operator

This is both true and misleading.
The shell pipe operation came from functional programming.  In fact the
shell pipe operation is NOT "flip apply", which is what |> is, but it is
functional composition.  That is
    <in command = command(in)
    command >out = let out = command
    cmd1 | cmd2 = \x.cmd2(cmd1(x)).

Pragmatically, the Unix shell pipe operator does something very important,
which |> (and even functional composition in F#) doesn't:
<in cmd1 | cmd2 > out *interleaves* the computation of cmd1 and cmd2,
streaming the data.  But in R, x |> f() |> g() is by definition g(f(x)), and
if g needs the value of its argument, the *whole* of f(x) is evaluated
before g resumes.  This is much closer to what the pipe syntax in the MS-DOS
shell did, if I recall correctly.



On Wed, 4 Jan 2023 at 17:46, Milan Glacier <news using milanglacier.com> wrote:

> With 50 years of programming experience, just think about how useful 
> pipe operator is in shell scripting. The output of previous call 
> becomes the input of next call... Genius idea from our beloved unix 
> conversion...
>
>
> On 01/03/23 16:48, Sorkin, John wrote:
> >I am trying to understand the reason for existence of the pipe operator,
> >%>%, and when one should use it. It is my understanding that the operator
> >sends the file to the left of the operator to the function immediately to
> >the right of the operator:
> >
> >c(1:10) %>% mean results in a value of 5.5 which is exactly the same as
> >the result one obtains using the mean function directly, viz.
> >mean(c(1:10)). What is the reason for having two syntactically different
> >but semantically identical ways to call a function? Is one more efficient
> >than the other? Does one use less memory than the other?
> >
> >P.S. Please forgive what might seem to be a question with an obvious
> >answer. I am a programmer dinosaur. I have been programming for more than
> >50 years. When I started programming in the 1960s the only pipe one spoke
> >about was a bong.
> >
> >John
>
> ______________________________________________
> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see 
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
