[Rd] conflicted: an alternative conflict resolution strategy

Duncan Murdoch murdoch.duncan using gmail.com
Fri Aug 24 20:33:41 CEST 2018


On 24/08/2018 3:12 AM, Jari Oksanen wrote:
> If you have to load two packages which both export the same name in 
> their namespaces, namespace does not help in resolving which synonymous 
> function to use. Neither does it help to have a package instead of a 
> script as long as you end up loading two namespaces with name conflicts. 

You can't import the same name from two packages without getting an 
error message (at least when checking --as-cran, I'm not sure about 
vanilla checks), so this is already handled.

If you really only want one of the imports, then importing individual 
functions is the solution.  Don't import everything from the package. 
This is a good idea in any case.

If you want both of the imports, then there's the undocumented (?) 
ability to rename a function on import, as well as the documented 
possibility of using :: for one of them instead of importing it.
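
A minimal sketch of the `::` route at script level, using only what ships with R. Here "fakepkg" is a hypothetical stand-in (an attached environment, not a real package) whose filter() masks stats::filter(); the alias at the end is only a script-level analogue of renaming on import, not the NAMESPACE mechanism itself. In a package, the documented way to take a single function is a directive such as importFrom(stats, filter).

```r
# Hypothetical stand-in for a package that exports its own filter().
fakepkg <- new.env()
fakepkg$filter <- function(x, keep) x[keep]
attach(fakepkg, name = "fakepkg")

# The unqualified name now resolves to the most recently attached entry.
first_hit <- utils::find("filter")[1]

x <- c(1, 2, 3, 4)
via_search_path <- filter(x, x > 2)                            # fakepkg's version
via_namespace <- stats::filter(x, rep(1 / 2, 2), sides = 1)    # stats' version

# Script-level analogue of a rename: bind the other function to a new name.
stats_filter <- stats::filter

detach("fakepkg")
```

With `::` the result no longer depends on the order in which things were attached.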

> The order of importing namespaces can also be difficult to control, 
> because you may end up loading a namespace already when you start your R 
> with a saved workspace.

That doesn't make sense in the context of a package.  Packages import 
what they ask to import. The user's workspace is irrelevant to code 
within the package if it does its imports properly.  You can reference 
functions that are not imported, but you get a message when you run 
checks to tell you not to do that.
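
A small sketch of why the workspace does not matter to package code, assuming only base R and stats. stats::sd() calls var() internally; shadowing var() in the global workspace changes what a script sees, but not what sd() finds, because sd() resolves names through the stats namespace first.

```r
# Shadow var() in the global workspace with a deliberately wrong definition.
var <- function(...) 100

x <- c(1, 2, 3)
workspace_result <- var(x)       # the workspace definition is found first
package_result <- stats::sd(x)   # sd() still uses stats' own var()

# Clean up so the real var() is visible again.
rm(var)
```

Here `workspace_result` is 100, while `package_result` is the usual standard deviation of 1.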

Duncan Murdoch

> Moving a function to another package may be a
> transitional issue which disappears when both packages are at their 
> final stages, but if you use the recommended deprecation stage, the same 
> names can live together for a long time. So this package is a good idea, 
> and preferably base R should be able to handle the issue of choosing 
> between exported synonymous functions.
> 
> This has bitten me several times in package development, and with 
> growing CRAN it is a growing problem. Package authors often have poor 
> control of the issue, as they do not know what packages users use. Now 
> we can only have a FAQ that tells that a certain error message does not 
> come from a function in our package, but from some other package having 
> a synonymous function that was used instead.
> 
> cheers, Jari Oksanen
> 
>> On 23 Aug 2018, at 23:46 pm, Duncan Murdoch <murdoch.duncan using gmail.com 
>> <mailto:murdoch.duncan using gmail.com>> wrote:
>>
>> First, some general comments:
>>
>> This sounds like a useful package.
>>
>> I would guess it has very little impact on runtime efficiency except 
>> when attaching a new package; have you checked that?
>>
>> I am not so sure about your heuristics.  Can they be disabled, so the 
>> user is always forced to make the choice?  Even when a function is 
>> intended to adhere to the superset principle, they don't always get it 
>> right, so a really careful user should always do explicit disambiguation.
>>
>> And of course, if users wrote most of their long scripts as packages 
>> instead of as long scripts, the ambiguity issue would arise far less 
>> often, because namespaces in packages are intended to solve the same 
>> problem as your package does.
>>
>> One more comment inline about a typo, possibly in an error message.
>>
>> Duncan Murdoch
>>
>> On 23/08/2018 2:31 PM, Hadley Wickham wrote:
>>> Hi all,
>>> I’d love to get your feedback on the conflicted package, which 
>>> provides an
>>> alternative strategy for resolving ambiguous function names (i.e. when
>>> multiple packages provide identically named functions). conflicted 0.1.0
>>> is already on CRAN, but I’m currently preparing a revision
>>> (<https://github.com/r-lib/conflicted>), and looking for feedback.
>>> As you are no doubt aware, R’s default approach means that the most
>>> recently loaded package “wins” any conflicts. You do get a message about
>>> conflicts on load, but I see a lot of newer R users experiencing problems
>>> caused by function conflicts. I think there are three primary reasons:
>>> -   People don’t read messages about conflicts. Even if you are
>>>     conscientious and do read the messages, it’s hard to notice a single
>>>     new conflict caused by a package upgrade.
>>> -   The warning and the problem may be quite far apart. If you load all
>>>     your packages at the top of the script, it may potentially be 100s
>>>     of lines before you encounter a conflict.
>>> -   The error messages caused by conflicts are cryptic because you end
>>>     up calling a function with utterly unexpected arguments.
>>> For these reasons, conflicted takes an alternative approach, forcing the
>>> user to explicitly disambiguate any conflicts:
>>>     library(conflicted)
>>>     library(dplyr)
>>>     library(MASS)
>>>     select
>>>     #> Error: [conflicted] `select` found in 2 packages.
>>>     #> Either pick the one you want with `::`
>>>     #> * MASS::select
>>>     #> * dplyr::select
>>>     #> Or declare a preference with `conflicted_prefer()`
>>>     #> * conflict_prefer("select", "MASS")
>>>     #> * conflict_prefer("select", "dplyr")
>>
>> I don't know if this is a typo in your r-devel message or a typo in 
>> the error message, but you say `conflicted_prefer()` in one place and 
>> conflict_prefer() in the other.
>>
>>> conflicted works by attaching a new “conflicted” environment just after
>>> the global environment. This environment contains an active binding for
>>> any ambiguous bindings. The conflicted environment also contains
>>> bindings for `library()` and `require()` that rebuild the conflicted
>>> environment and suppress default reporting (but are otherwise thin wrappers
>>> around the base equivalents).
>>> conflicted also provides a `conflict_scout()` helper which you can use
>>> to see what’s going on:
>>>     conflict_scout(c("dplyr", "MASS"))
>>>     #> 1 conflict:
>>>     #> * `select`: dplyr, MASS
>>> conflicted applies a few heuristics to minimise false positives (at the
>>> cost of introducing a few false negatives). The overarching goal is to
>>> ensure that code behaves identically regardless of the order in which
>>> packages are attached.
>>> -   A number of packages provide a function that appears to conflict
>>>     with a function in a base package, but they follow the superset
>>>     principle (i.e. they only extend the API, as explained to me by
>>>     Hervé Pagès).
>>>     conflicted assumes that packages adhere to the superset principle,
>>>     which appears to be true in most of the cases that I’ve seen. For
>>>     example, the lubridate package provides `as.difftime()` and `date()`
>>>     which extend the behaviour of base functions, and provides S4
>>>     generics for the set operators.
>>>         conflict_scout(c("lubridate", "base"))
>>>         #> 5 conflicts:
>>>         #> * `as.difftime`: [lubridate]
>>>         #> * `date`       : [lubridate]
>>>         #> * `intersect`  : [lubridate]
>>>         #> * `setdiff`    : [lubridate]
>>>         #> * `union`      : [lubridate]
>>>     There are two popular functions that don’t adhere to this principle:
>>>     `dplyr::filter()` and `dplyr::lag()` :(. conflicted handles these
>>>     special cases so they correctly generate conflicts. (I sure wish I’d
>>>     known about the superset principle when creating dplyr!)
>>>         conflict_scout(c("dplyr", "stats"))
>>>         #> 2 conflicts:
>>>         #> * `filter`: dplyr, stats
>>>         #> * `lag`   : dplyr, stats
>>> -   Deprecated functions should never win a conflict, so conflicted
>>>     checks for use of `.Deprecated()`. This rule is very useful when
>>>     moving functions from one package to another. For example, many
>>>     devtools functions were moved to usethis, and conflicted ensures
>>>     that you always get the non-deprecated version, regardless of package
>>>     attach order:
>>>         head(conflict_scout(c("devtools", "usethis")))
>>>         #> 26 conflicts:
>>>         #> * `use_appveyor`       : [usethis]
>>>         #> * `use_build_ignore`   : [usethis]
>>>         #> * `use_code_of_conduct`: [usethis]
>>>         #> * `use_coverage`       : [usethis]
>>>         #> * `use_cran_badge`     : [usethis]
>>>         #> * `use_cran_comments`  : [usethis]
>>>         #> ...
>>> Finally, as mentioned above, the user can declare preferences:
>>>     conflict_prefer("select", "MASS")
>>>     #> [conflicted] Will prefer MASS::select over any other package
>>>     conflict_scout(c("dplyr", "MASS"))
>>>     #> 1 conflict:
>>>     #> * `select`: [MASS]
>>> I’d love to hear what people think about the general idea, and if there
>>> are any obviously missing pieces.
>>> Thanks!
>>> Hadley
>>>
>>
>> ______________________________________________
>> R-devel using r-project.org <mailto:R-devel using r-project.org> mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>


