[Rd] How to deal with package conflicts

Duncan Murdoch murdoch.duncan at gmail.com
Fri Nov 25 17:37:51 CET 2011


On 25/11/2011 10:37 AM, Terry Therneau wrote:
> On Fri, 2011-11-25 at 09:50 -0500, Duncan Murdoch wrote:
> >  I think the general idea in formulas is that it is up to the user to
> >  define the meaning of functions used in them.  Normally the user has
> >  attached the package that is working on the formula, so the package
> >  author can provide useful things like s(), but if a user wanted to
> >  redefine s() to their own function, that should be possible.
> >  Formulas do have environments attached, so both variables and
> >  functions should be looked up there.
> >
>
> I don't agree that this is the best way.  A function like coxph could
> easily have in its documentation a list of the "formula specials" that
> it defines internally.  If the user wants something of their own they can
> easily use a different word.  In fact, I would strongly recommend that
> they don't use one of these key names.  For things that work across
> multiple packages like ns(), what user in his right mind would redefine
> it?

Yes, that's what I described in the second part of my answer, and you 
can do it too in coxph.  It requires some work to do special processing 
of symbols in a formula, but it is already being done for + and : and *, 
so doing it as well for some other functions would be reasonable.  If 
you don't mind some programming on the formula object, it's not even 
very hard.
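
As an illustration of that kind of processing, here is a minimal sketch 
using the specials argument of terms(), which marks chosen function names 
in a formula without evaluating them.  The formula and the variable names 
(y, age, x1, x2) are made up for the example, and this is not necessarily 
how coxph itself treats ridge().

```r
# A made-up formula; ridge() does not need to exist for terms() to work,
# because terms() only inspects the formula and never evaluates it.
f <- y ~ age + ridge(x1, x2, theta = 1)

# Ask terms() to flag calls to ridge() as "specials".
tt <- terms(f, specials = "ridge")

# Position of the ridge() term among the model variables
# (the response counts as the first variable).
idx <- attr(tt, "specials")$ridge

# The variables attribute is a call to list(), so element 1 is `list`
# and the variable at position idx sits at idx + 1.
ridgeCall <- attr(tt, "variables")[[idx + 1]]
print(ridgeCall)
```

Once the package knows which terms are its specials, it can decide for 
itself how to evaluate them, rather than relying on whatever function 
of that name is visible to the user.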

As to a user defining their own ns() function:  that seems like it's not 
something we should disallow, especially if it was done in a context 
where natural splines weren't being used.  It might have nothing to do 
with the ns() function in the splines package, but it might mean 
something to the user in terms of his own data.  The splines package is 
a base package so it's not a great idea to re-use the name, but many 
users would not have splines attached, and wouldn't notice that they had 
just masked the splines::ns function.
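
To make the masking concrete, here is a small sketch.  The user's ns() 
below is a hypothetical rescaling helper, and the data frame is invented 
for the example.

```r
# A user's own ns(): rescale to the unit interval.  It has nothing to do
# with natural splines, but it has the same name as splines::ns.
ns <- function(x) (x - min(x)) / (max(x) - min(x))

d <- data.frame(y = c(1, 3, 2, 5), x = c(10, 20, 30, 40))

# The formula is created here, so ns() is looked up here and the
# user's definition is the one that gets called.
mf <- model.frame(y ~ ns(x), data = d)
rescaled <- mf[["ns(x)"]]

# The version in the splines package is still reachable by its full name.
basis <- splines::ns(d$x, df = 2)
```

Here rescaled comes from the user's function, while basis is the 
natural spline basis matrix from the splines package.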

>    So I re-raise the question.  Is there a reasonably simple way to make
> the survival ridge() function specific to survival formulas?  It sets up
> structures that have no meaning anywhere else, and its global definition
> stands in the way of other sensible uses.  Having it be not exported +
> obey namespace type semantics would be a plus all around.

Yes, there is a way to do what you want.  Don't export the function from 
the package, but preprocess formulas coming into coxph to substitute 
things that look like calls to ridge() with calls to something local.

For example, this does the substitution.  I haven't checked it much, so 
it might mess up something else (and there might be more elegant ways 
to write it, using e.g. rapply).  It is definitely slightly more 
elaborate than it needs to be (no need for the separate local function), 
but that's so you can make the outer function do a bit more than the 
recursive part does.

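fixRidge <- function( formula ) {

   # Walk the expression tree.  Every symbol named ridge is replaced
   # (this includes a variable of that name, not only calls to ridge()).
   recurse <- function( e ) {
     if (is.name(e)) {
        if (identical(e, quote(ridge))) e <- quote(survival:::ridge)
     } else if (is.call(e)) {
        # a formula is itself a call to `~`, so this branch handles it too
        for (i in seq_along(e))
           e[[i]] <- recurse(e[[i]])
     }
     e   # constants are returned unchanged
   }

   recurse(formula)
}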

This replaces calls to ridge in the formula with calls to survival:::ridge.
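
The function above rewrites every symbol named ridge, so a variable 
called ridge would be rewritten as well.  A slightly stricter variant 
might touch only the function position of a call.  The sketch below is 
one way to do that; substituteCall and localRidge are hypothetical names 
used for illustration, and it has had no more testing than the original.

```r
# Hypothetical helper: replace calls to `from` with calls to `to`,
# leaving variables and constants alone.
substituteCall <- function(formula, from, to) {
  recurse <- function(e) {
    if (!is.call(e)) return(e)          # symbols and constants: unchanged
    if (identical(e[[1]], from)) e[[1]] <- to
    for (i in seq_along(e)[-1]) {
      # Only descend into arguments that are themselves calls.
      if (is.call(e[[i]])) e[[i]] <- recurse(e[[i]])
    }
    e
  }
  recurse(formula)
}

# ridge appears once as a call and once as a plain variable.
f <- y ~ x + ridge(z, theta = 1) + log(ridge)
g <- substituteCall(f, quote(ridge), quote(localRidge))
print(g)
```

Inside a package one could pass quote(survival:::ridge), or the name of 
any unexported local function, as the replacement.  The result is still 
a formula and keeps the environment of the original.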


> Philosophical aside:
>    I have discovered to my dismay that formulas do have environments
> attached, and that variables/functions are looked up there.  This made
> sensible semantics for predict() within a function impossible for some
> of the survival functions, unless I were to change all the routines to a
> model=TRUE default.  (And a change of that magnitude to survival, with
> its long list of dependencies, is not fun to contemplate.  A very quick
> survey reveals several dependent packages will break.) So I don't agree
> nearly so fully with the "should" part of your last sentence.  The out
> of context evaluations allowed by environments are, I find, always
> tricky and often lead to intricate special cases.
>    Thus, moving back and forth between how it seems that a formula should
> work, and how it actually does work, sometimes leaves my head
> spinning.
>

It all comes down to the question:  who owns the name?  Generally the 
caller owns the name.  So you should look it up in the context of the 
caller.  In R, that means you need to carry along the environment of the 
caller.
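
A small sketch of what carrying the environment along means in practice. 
The names makeFormula and scale2 are invented for the example.

```r
# The formula is created inside a function, so its environment is that
# function's frame.  scale2(), w and y exist only there.
makeFormula <- function() {
  scale2 <- function(x) 2 * x
  w <- c(1, 2, 3)
  y <- c(2, 4, 6)
  y ~ scale2(w)
}

f <- makeFormula()

# Nothing called scale2 is visible from the formula's parent scope ...
visibleOutside <- exists("scale2", envir = parent.env(environment(f)))

# ... but model.frame() still finds it, because with no data argument it
# evaluates the variables in the environment attached to the formula.
mf <- model.frame(f)
print(mf)
```

The caller defined scale2, w and y, and the formula remembers where to 
find them even after makeFormula() has returned.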

Duncan Murdoch

> Terry T.
>
>
> Terry Therneau
>


