[Rd] should base R have a piping operator ?

Ant F
Sat Oct 5 20:26:25 CEST 2019


Yes, but this exaggeration precisely misses the point.

Concerning your examples:

* I love fread, but I think it makes a lot of subjective choices that are
best associated with a package. It has changed a lot over time and can
still change, and we have great developers willing to maintain it and be
responsive to feature requests and bug reports.

* group_by() adds a class that works only (or mostly) with tidyverse verbs,
so it is very easy to dismiss as a candidate for inclusion in base R.

* summarize is an alternative to aggregate; it would be very confusing to
have both.
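
As a reminder of what base R already offers here, a grouped summary with
aggregate() looks roughly like this. The built-in mtcars data set and the
variable name by_cyl are chosen only for illustration:

```r
# Mean miles per gallon for each number of cylinders, base R only.
by_cyl <- aggregate(mpg ~ cyl, data = mtcars, FUN = mean)
by_cyl
```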

Now, to be fair to your argument, we could think of other functions such as
data.table::rleid(), which I believe base R misses deeply, and there is
nothing wrong with packaged functions making their way into base R.
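
To show what is being asked for, run-length ids can be approximated in base
R with rle(). This is only a rough sketch: the helper name rleid_base is
hypothetical, it handles a single atomic vector, and rle() treats each NA as
its own run, which may differ from what data.table::rleid() does.

```r
# Hypothetical helper: give each run of consecutive equal values an id.
rleid_base <- function(x) {
  runs <- rle(x)
  rep(seq_along(runs$lengths), times = runs$lengths)
}

x <- c("a", "a", "b", "b", "b", "a")
rleid_base(x)
# 1 1 2 2 2 3
```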

Maybe there's an existing list of criteria for inclusion in base R, but if
not I can make one up for the sake of this discussion :) :
* 1) the functionality should not already exist
* 2) the function should be general enough
* 3) the function should have a large number of potential users
* 4) the function should be robust, and not require extensive maintenance
* 5) the function should be stable, we shouldn't expect new features every 2
months
* 6) the function should have an intuitive interface in the context of the
rest of base R

I guess 1 and 6 could be held against my proposal, because:
(1) everything can be done without pipes
(6) they are somewhat surprising (though with explicit dots not that much,
and no more surprising than, say, `bquote()`)

In my opinion the pros outweigh the cons.

I wouldn't advise taking magrittr's pipe (provided the license allows it),
for instance, because it makes a lot of design choices and has complex
behavior. What I propose is 2 lines of code that are very unlikely to evolve
or require maintenance.
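
For illustration, a pipe with explicit dots can be written in about two
lines of base R. This is a minimal sketch and not necessarily the exact code
proposed earlier in the thread; the operator name `%.>%` is made up for this
example.

```r
# Hypothetical operator: evaluate rhs with `.` bound to the value of lhs.
`%.>%` <- function(lhs, rhs)
  eval(substitute(rhs), envir = list(. = lhs), enclos = parent.frame())

# The same computation as nested calls and as a pipeline with explicit dots.
nested <- round(sqrt(sum(c(1, 4, 9))), 2)
piped  <- c(1, 4, 9) %.>% sum(.) %.>% sqrt(.) %.>% round(., 2)
identical(nested, piped)
```

Because the dot is always written out, there is no rule about where the
left-hand side gets inserted, which is most of what keeps the definition
this short.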

Antoine

PS: I only receive the digest once a day, so if you don't "reply all" I can
only react later.

On Sat, Oct 5, 2019 at 19:54, Hugh Marera <hugh.marera using gmail.com> wrote:

> I exaggerated the comparison for effect. However, it is not very difficult
> to find functions in dplyr or data.table or indeed other packages that one
> may wish to be in base R. Examples, for me, could include
> data.table::fread, dplyr::group_by & dplyr::summari[sZ]e combo, etc. Also,
> the "popularity" of magrittr::`%>%` is mostly attributable to the tidyverse
> (an advanced superset of R). Many R users don't even know that they are
> installing the magrittr package.
>
> On Sat, Oct 5, 2019 at 6:30 PM Iñaki Ucar <iucar using fedoraproject.org> wrote:
>
>> On Sat, 5 Oct 2019 at 17:15, Hugh Marera <hugh.marera using gmail.com> wrote:
>> >
>> > How is your argument different to, say, "Should dplyr or data.table be
>> > part of base R as they are the most popular data science packages and
>> > they are used by a large number of users?"
>>
>> Two packages with many features, dozens of functions and under heavy
>> development to fix bugs, add new features and improve performance, vs.
>> a single operator with a limited and well-defined functionality, and a
>> reference implementation that hasn't changed in years (but certainly
>> hackish in a way that probably could only be improved from R itself).
>>
>> Can't you really spot the difference?
>>
>> Iñaki
>>
>
