[Rd] Error "promise already under evaluation ..." with function(x, dim=dim(x))

Henrik Bengtsson hb at biostat.ucsf.edu
Mon Nov 24 01:07:44 CET 2014


On Sat, Nov 15, 2014 at 1:47 AM, Duncan Murdoch
<murdoch.duncan at gmail.com> wrote:
>
> On 14/11/2014, 9:06 PM, Henrik Bengtsson wrote:
> > I've meant to ask the following for several years now.  I understand why:
> >
> >> foo <- function(x, dim=dim) { dim }
> >> foo(1)
> > Error in foo(1) :
> >   promise already under evaluation: recursive default argument
> > reference or earlier problems?
> >
> > gives an error, but why wouldn't/couldn't the following work?
> >
> >> foo <- function(x, dim=dim(x)) { dim }
> >> foo(1)
> > Error in foo(1) :
> >   promise already under evaluation: recursive default argument
> > reference or earlier problems?
>
> You refer to "dim".  There's a dim defined in the argument list, so R
> uses that definition of it.
>
> But you didn't supply any value, so it tries to evaluate the default
> value.  Default expressions are always evaluated in the evaluation frame
> of the function call, so it looks for a function named "dim" in the
> local frame.
>
> It finds the argument in the local frame, so it tries to figure out if
> it is a function or a value.  It needs to evaluate it to do that, and
> you get the recursion.
>
> >
> > As a workaround I also tried:
> >
> >> foo <- function(x, dim) { if (missing(dim)) dim <- dim(x); dim }
> >> foo(1)
> > Error in foo(1) : argument "dim" is missing, with no default
> >
> > which surprised me too.
> >
> >
> > For the first case, is the rationale related to:
> >
> >> foo <- function(x, a=dim(x), dim) { a }
> >> foo(1)
> > Error in foo(1) : argument "dim" is missing, with no default
> >
> > and
> >
> >> foo <- function(x, a=dim(x), dim=a) { a }
> >> foo(1)
> > Error in foo(1) :
> >   promise already under evaluation: recursive default argument
> > reference or earlier problems?
> >
> > [since here argument 'dim' could take a function, e.g. foo(1,
> > dim=length)], and that R treats
> >
> > foo <- function(x, dim=dim(x)) { dim }
> >
> > in a similar way?  That is, is R not "clever" enough to detect this as
> > a special case, but instead goes ahead and tries to evaluate the
> > default expression (=dim(x)) of argument 'dim' in order to get its
> > default value?  If so, is there anything preventing R from support
> > this "special case", e.g. by evaluating the default expression without
> > argument/symbol 'dim' itself being in the picture to avoid "it finds
> > itself"?  (Sorry if I'm using the incorrect words here).
>
> No, it shouldn't do that.  It should use consistent rules for evaluation
> or there would be sure to be bugs.
>
> >
> > Yes, I understand that I can do:
> >
> >> foo <- function(x, dim=base::dim(x)) { dim }
>
> This is what you should do.
>
> >> foo(1)
> > NULL
> >
> >> foo <- function(x, dim=NULL) { if (is.null(dim)) dim <- dim(x); dim }
>
> This works, because when R is looking up the function dim(), it can
> evaluate the local argument dim and see it is not a function, so it
> proceeds to the parent frame.
>
> >> foo(1)
> > NULL
> >
> > or
> >
> >> foo <- function(x, dim.=dim(x)) { dim. }
> >> foo(1)
> > NULL
>
> This is another solution that works, but it has the ugly argument name
> now, so you'll get warnings during package checks from calls like
>
> foo(1, dim=2)
>
> >
> > but I would prefer not to have to turn those rather ad hoc solutions in my code.
>
> Nothing ad hoc about the first one.

Thanks for the feedback.  I agree that base::dim(x) is clean and
clear, but unfortunately there is a ~500 times overhead in using '::'.
Since I went through the effort of doing the benchmarking and find
faster solutions, I'm sharing the following:

> library("microbenchmark")

> x <- matrix(1:(80*80), nrow=80)

> # Not "legal", because it calls .Primitive().
> dim_illegal <- base::dim

> dim_R <- function(x) {
+   ns <- getNamespace("base")
+   dim <- get("dim", envir=ns, inherits=FALSE, mode="function")
+   dim(x)
+ }

> dim_R_memoized <- local({
+   dim <- NULL
+   function(x) {
+     if (is.null(dim)) {
+       dim <<- get("dim", envir=getNamespace("base"), inherits=FALSE,
mode="function")
+     }
+     dim(x)
+   }
+ })

> stats <- microbenchmark(
+   dim(x),
+   base::dim(x),
+   dim_R(x),
+   dim_R_memoized(x),
+   dim_illegal(x),
+   sum(x),
+   unit="ns",
+   times=10e3
+ )
Warning message:
In microbenchmark(dim(x), base::dim(x), dim_R(x), dim_R_memoized(x),  :
  Could not measure a positive execution time for 3859 evaluations.

> print(stats)
Unit: nanoseconds
              expr  min   lq       mean median    uq     max neval   cld
            dim(x)    0    0    25.2226      1     1   10780 10000 a
      base::dim(x) 6545 7700 10429.0165   8470 12897 2678155 10000     e
          dim_R(x) 3080 3851  5163.8612   4236  6545   55435 10000   c
 dim_R_memoized(x)  385  771  1238.8292   1156  1541   44656 10000  b
    dim_illegal(x)    0    1    51.4421      1     1    5775 10000 a
            sum(x) 8085 8470  9590.9570   8470 10395   49660 10000    d

Yes, yes, the extra cost of using base::dim(x) is only ~10 us, but if
you do, say, a million bootstrap samples calling this function, that's
an extra unnecessary 10 seconds of processing time.  As a comparison,
the overhead is roughly the same as summing 6400 integers.

For workarounds, I considered:

(a) dim_illegal
(b) dim_R
(c) dim_R_memoized

where,

(a) would be "good enough", but can immediately be discarded because
if used in a package, it will create a copy of base::dim and thereby
call .Primitive() immediately, which is unsafe.

(b) is a poor-mans version of try to cut the corners of '::', but
there is still a substantial overhead in each call, but still a 25-50%
speedup compared to '::'.

(c) is a smarter version of (b) that does the look up only ones, and
managed to reduce the overhead to 10% of '::'.  It's still 50 times
the overhead of a direct dim(x) call.

Since one can byte compile packages (ByteCompile: TRUE in
DESCRIPTION), I've also played around with compiler::cmpfun() and that
prunes off about 10% of the non-compiled ditto.  I was
somewhat/naively hoping that the compiler would be able to compile
base::dim into a "constant", but that doesn't seem to be the case.


BTW, is the following, which is ~2 times as fast as dim_R_memoized(),
valid in an R package?   Will it set the local 'dim' variable when the
package is loaded, which I assume is safe/legal, or before?  I didn't
include it above, because I wasn't sure it was safe/valid.

dim_R_memoized_2 <- local({
  dim_local <- base::dim
  function(x) dim_local(x)
})


Thanks,

/Henrik


>
> Duncan Murdoch
>
> >
> >
> > Thanks,
> >
> > Henrik
> >
> > ______________________________________________
> > R-devel at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
> >
>



More information about the R-devel mailing list