[Rd] [External] Warning with new placeholder piped to data.frame extractors `[` and `[[`.

Rui Barradas
Thu Jul 21 17:25:41 CEST 2022


Hello,

Thanks for the extra explanations. This thread made the behavior of the 
data.frame extraction methods clear and should now end.
The example on extractors matching arguments by position is new to me 
and very useful.

Thanks again to all, in particular to Luke, Toby, Gabriel and Duncan.

Rui Barradas

At 02:16 on 21/07/2022, luke-tierney using uiowa.edu wrote:
> On Wed, 20 Jul 2022, Rui Barradas wrote:
> 
>> Hello,
>>
>> I agree with several points you've made.
>>
>> The code of the data.frame methods for `[` and `[[` is already 
>> complicated enough, and a revision is probably not worth the effort. 
>> Constructs like piping to `[` and `[[` do not further the cause of 
>> readability, and a new base R function like dplyr::pull would put an 
>> extra development and maintenance burden on the R Core Team, to whom 
>> we are greatly indebted for their excellent, difficult and 
>> time-consuming work developing, maintaining and evolving R over 
>> the years.
>>
>> My point, that if the named argument syntax is mandatory then it should 
>> not throw a warning, seems to have drawn the responses that this use of 
>> the new pipe operator and placeholder should be discouraged (Toby), 
>> is a bug (Gabriel) or is maybe intentional (Duncan). It is definitely 
>> an unclear idiom to be avoided, and not a priority.
>>
>> I still find it strange but if R is telling the programmer to write 
>> better code then follow the advice.
>>
>> (As a side note, all of the following work as expected:
>>
>> 1:6 |> `[`(x = _, 2)
>> 1:6 |> `[[`(x = _, 2)
> 
> Depends on what you expect. This is probably not what you expect:
> 
>      > `[`(2, x = 1:6)
>      [1]  2 NA NA NA NA NA
> 
> For performance reasons many primitives were implemented to
> not do argument matching on named arguments but to accept arguments by
> position. This is particularly true for syntactically special
> functions like arithmetic and extraction operators. You can use named
> arguments in these, but the names are ignored by the default methods,
> which just go by position. S3 methods implemented as R functions
> usually will handle the named arguments in the usual way, but can
> choose not to, as the data.frame extraction methods do.
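
The positional matching described above can be checked with a minimal sketch. It assumes only base R; the variable names `by_position` and `usual` are illustrative and not from the thread. With the default (internal) method of `[`, the first argument supplied is treated as the object being indexed, whatever it is named:

```r
# Default method of `[`: argument names are ignored, matching is by position.
# Here 2 is the object being indexed and 1:6 is the index, despite 'x = 1:6'.
by_position <- `[`(2, x = 1:6)   # behaves like (2)[1:6]
print(by_position)

# With the arguments in the usual order, the name 'x' happens to be harmless.
usual <- `[`(x = 1:6, 2)         # behaves like (1:6)[2]
print(usual)
```

The first call indexes a length-one vector at positions 1 to 6, which is why everything past the first element is NA.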
> 
> Arguably the performance issue is now moot as almost all
> performance-critical code will be byte compiled. But adding argument
> matching in all primitives is not something I can see getting high
> priority at the moment.
> 
> As far as I can see, it looks like dropping the warning for a named
> 'x' argument in the S3 extraction methods for data.frame would be
> fairly straightforward and shouldn't cause any disruption. But this
> wouldn't make it into a release until the placeholder is allowed at
> the head of an extraction chain, assuming we go there.
> 
> Best,
> 
> luke
>>
>> matrix(1:6, nrow = 3) |> `[`(x = _, 2, 2)
>> matrix(1:6, nrow = 3) |> `[`(x = _, 2, )
>> matrix(1:6, nrow = 3) |> `[`(x = _, , 2)
>>
>> list(1:6, b = 7:10) |> `[`(x = _, 2)
>> list(1:6, b = 7:10) |> `[[`(x = _, 2)
>> list(1:6, b = 7:10) |> `$`(x = _, 'b')
>>
>> So this is specific to the data.frame methods.)
>>
>> Hope this helps,
>>
>> Rui Barradas
>>
>> At 23:44 on 18/07/2022, luke-tierney using uiowa.edu wrote:
>>> On Sat, 16 Jul 2022, Rui Barradas wrote:
>>>
>>>> Hello,
>>>>
>>>> When piping to either of `[.data.frame` or `[[.data.frame`, the 
>>>> placeholder is mandatory.
>>>>
>>>>
>>>> df1 <- data.frame(y = 1:10, f = rep(c("a", "b"), each = 5))
>>>>
>>>> aggregate(y ~ f, df1, mean) |> `[`('y')
>>>> # Error: function '[' not supported in RHS call of a pipe
>>>>
>>>> aggregate(y ~ f, df1, mean) |> `[[`('y')
>>>> # Error: function '[[' not supported in RHS call of a pipe
>>>>
>>>>
>>>>
>>>> But if used it throws a warning.
>>>>
>>>>
>>>>
>>>> aggregate(y ~ f, df1, mean) |> `[`(x = _, 'y')
>>>> #  Warning in `[.data.frame`(x = aggregate(y ~ f, df1, mean), "y"): 
>>>> #  named arguments other than 'drop' are discouraged
>>>> #    y
>>>> #  1 3
>>>> #  2 8
>>>>
>>>> aggregate(y ~ f, df1, mean) |> `[[`(x = _, 'y')
>>>> #  Warning in `[[.data.frame`(x = aggregate(y ~ f, df1, mean), "y"): 
>>>> #  named arguments other than 'exact' are discouraged
>>>> #  [1] 3 8
>>>>
>>>
>>> The pipe syntax requires that the placeholder be used as a named
>>> argument.  If you do that, then the syntax is legal and parses
>>> successfully.
>>>
>>>> Hasn't this become inconsistent behavior?
>>>> The named argument is not merely allowed but mandatory, so it 
>>>> shouldn't give warnings.
>>>
>>> Any R function can decide whether it wants to allow explicitly named
>>> arguments.  Disallowing or discouraging using explicitly named
>>> arguments requires some work and is usually not a good idea. In the
>>> case of the data.frame methods for [ and [[ the decision was made to
>>> discourage using named arguments other than 'drop' (for [) or 'exact'
>>> (for [[). This seems to have been to allow a more expedient way to
>>> implement these functions. This could be revisited, but I doubt it is
>>> worth the effort.
>>>
>>> For me the main reason for using pipes is to make code more
>>> readable. Using `[` and such constructs is not furthering that
>>> cause. When I use pipes I am almost always using tidyverse
>>> features, so I have dplyr::pull available, which is more readable,
>>> to me at least. Arguably, base R could have a similar function,
>>> but again I doubt this would be a good investment of time.
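
As a rough illustration of what a pull-like helper could look like, here is a minimal sketch. The function `pull_col` is hypothetical, not part of base R and not something proposed in this thread; `getElement()` and the anonymous-function form are existing base R alternatives that avoid passing the placeholder to `[[` as a named argument. The native pipe and `\(d)` shorthand assume R 4.1 or later.

```r
# Hypothetical helper (not in base R): extract a single column from a
# data frame by name or position, so that it reads naturally after |>.
pull_col <- function(data, col) {
  stopifnot(is.data.frame(data), length(col) == 1L)
  data[[col]]
}

df1 <- data.frame(y = 1:10, f = rep(c("a", "b"), each = 5))

# Piping to the helper: no placeholder and no named 'x' argument needed.
means <- aggregate(y ~ f, df1, mean) |> pull_col("y")
print(means)

# Existing base R alternatives that give the same result:
means_ge  <- aggregate(y ~ f, df1, mean) |> getElement("y")
means_fun <- aggregate(y ~ f, df1, mean) |> (\(d) d[["y"]])()
```

Whether any of these reads better than dplyr::pull is a matter of taste; the sketch only shows that the extraction can be written without triggering the named-argument path in the data.frame methods.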
>>>
>>> An option that we have experimented with is to allow the placeholder
>>> at the head of an extraction chain. This is supported in the
>>> experimental branch at
>>> https://svn.r-project.org/R/branches/R-syntax. So for example:
>>>
>>>      > mtcars |> _$cyl[1]
>>>      [1] 6
>>>
>>> This may make it into R-devel for the next release, but it still needs
>>> more testing.
>>>
>>> Best,
>>>
>>> luke
>>>
>>>>
>>>> Hope this helps,
>>>>
>>>> Rui Barradas
>>>>
>>>> ______________________________________________
>>>> R-devel using r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>>>
>>>
>>
>