[Rd] Question about match.fun()

Thu May 11 15:18:45 CEST 2006

On Tue, 9 May 2006, Berwin A Turlach wrote:

> Dear all,
>
> I was recently contacted by a user about an alleged problem/bug in
> the latest version of lasso2.  After some investigation, we found out
> that it was a user error which boils down to the following:
>
>> x <- matrix(rnorm(200), ncol=2)
>> var <- "fred"
>> apply(x, 2, var)
> Error in get(x, envir, mode, inherits) : variable "fred" of mode "function" was not found
>
> only that the "offending" apply() command happened inside the gl1ce()
> function of lasso2.
>
> I was under the impression that R can now distinguish between
> variables and functions with the same name and, indeed, the following
> works:
>
>> var <- 2
>> apply(x, 2, var)
> [1] 1.053002 1.250875
>
> Poking a bit around, I guess that the ability to distinguish between
> variables and functions with the same name comes from the introduction
> of the function match.fun() and, after reading its help page, the

No, not really.  It comes in general from the internal C functions knowing 
from the parse context what they are looking for.
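
A minimal illustration of that context-dependent lookup, as a sketch run
against a current R and not part of the original exchange: a name in call
position is resolved to a function, skipping any non-function binding of the
same name. The variable name `v` is chosen only for this example.

```r
# A numeric variable that masks the name of base::c().
c <- 3
# In call position `c` resolves to the function; as an argument it is the value 3.
v <- c(c, 4)
print(v)
```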

> reasons why an error is triggered the first time but not the second
> time is perfectly clear to me.
>
> I wonder whether it would make sense to change match.fun() such that
> the first case does not result in an error?  I was thinking along the
> line that if the argument to match.fun() is a variable that contains a
> character vector of length one then, using get(), match.fun() attempts
> to find a function with that name.  If the get() command does not
> succeed, then a second try is made using the name of the variable
> passed by the caller to match.fun().
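
To make the proposal above concrete, here is a rough sketch of the two-step
lookup it describes. The helper `match_fun_fallback` is hypothetical and is
not base R's match.fun(); it only handles a direct call and assumes a current
R with the stats package attached.

```r
# Hypothetical helper, NOT base R's match.fun(): a sketch of the proposed fallback.
match_fun_fallback <- function(FUN) {
  sym   <- substitute(FUN)   # the expression as written by the caller
  envir <- parent.frame()    # where the caller's names are looked up
  if (is.function(FUN)) return(FUN)
  if (is.character(FUN) && length(FUN) == 1L) {
    # First try: treat the string value as the name of a function.
    f <- tryCatch(get(FUN, envir = envir, mode = "function"),
                  error = function(e) NULL)
    if (!is.null(f)) return(f)
  }
  # Second try: fall back to the name of the variable itself.
  if (is.symbol(sym))
    return(get(as.character(sym), envir = envir, mode = "function"))
  stop("could not resolve a function")
}

var <- "fred"
f1 <- match_fun_fallback(var)   # no function "fred", so this falls back to var()
var <- "mean"
f2 <- match_fun_fallback(var)   # the string value wins when it names a function
```

The second call shows the ambiguity under discussion: the same argument `var`
resolves to two different functions depending on the value it happens to hold.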

This is tricky, and indeed the bit that appears strange to me is that
the second works.  It comes from

r5628 | pd | 1999-08-26 14:31:42 +0100 (Thu, 26 Aug 1999) | 2 lines
match.fun fixes

which is not very informative, but I found e.g.

http://tolstoy.newcastle.edu.au/R/help/99b/0254.html
PD> This also applies (!) to various other places that need to deal
PD> with FUN arguments (apply, sapply, sweep, outer). It might be
PD> preferable to make match.fun smarter, at the expense of making it
PD> completely obscure.

(and I think we succeeded!)  The essence of that example would appear to 
be

xlev <- list(a=1:7, length=NULL)
sapply(xlev, is.null)

which failed long, long ago.

Note that ?apply (and so on) in 2.3.0 only mention the possibility of 
supplying a function or a character string.  Even the latter seems 
unnecessary these days now we have backquotes:

x <- matrix(runif(20), 10, 2)
apply(x, 2, `+`, 7)

I spent some time recently tidying up the family functions in R-devel. 
There the issues are similar but complicated by the fact that in 
binomial(probit) there is no object `probit'.

I've come to the view that we are trying too hard in many of these cases.
So I would like to see arguments why we need to allow more than

         function
         symbol
         length-one character vector.

and I don't see that it lessens the confusion to allow the name of a length-one 
character vector to mean either the value of the first element of the 
object or a symbol if the value is not visible as a function.
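
As a quick check of the three forms listed above, the following sketch passes
each one to match.fun() directly. It assumes a current R; the variable names
are chosen only for this example.

```r
f_fun <- match.fun(mean)          # a function object
f_sym <- match.fun(quote(mean))   # a symbol
f_chr <- match.fun("mean")        # a length-one character vector
# All three forms should resolve to the same function.
same <- identical(f_fun, f_sym) && identical(f_sym, f_chr)
print(same)
```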

> So before trying to modify match.fun() and submitting a patch, I
> wanted to ask whether such a change would be accepted?  Is there an
> argument that I don't see that the first case should always result in
> an error and not be silently resolved?

The main argument is that it is not as documented, and confusing.

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595