[Rd] we need an exists/get hybrid

Peter Haverty haverty.peter at gene.com
Wed Dec 3 03:46:39 CET 2014


Hi All,

I've been looking into speeding up the loading of packages that use a lot
of S4.  After profiling I noticed the "exists" function accounts for a
surprising fraction of the time.  I have some thoughts about speeding up
exists (below). More to the point of this post, Martin Mächler noted that
'exists' and 'get' are often used in conjunction.  Both functions are
implemented by the same do_get C function, so it's a pity to run the lookup
twice.

"get" gives an error when a symbol is not found, so you can't just do a
'get'.  With R's C library, one might do

SEXP x = findVarInFrame3(env, symbol, TRUE);  // args: (rho, symbol, doGet)
if (x != R_UnboundValue) {
    // symbol is bound in env: do stuff with x
}

It would be very convenient to have something like this at the R level. We
don't want to do any tryCatch stuff or to add args to get; that would kill
any speed advantage (the overhead for handling redundant args accounts for
30% of the time used by "exists").  Michael Lawrence and I worked out that
we need a function that returns either the desired object, or something
that represents R_UnboundValue. We also need a very cheap way to check if
something equals this new R_UnboundValue. This might look like

if (defined(x <- fetch(symbol, env))) {
  do_stuff_with_x(x)
}
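
To make the single-lookup idea concrete, here is a toy C model of the
pattern. It is only a sketch and not R's internals: the frame is a small
array rather than a real environment, and every name in it (frame_t,
binding_t, frame_define, toy_fetch, toy_defined, TOY_UNBOUND) is made up for
illustration. The point is that one lookup returns either the value or a
unique sentinel, and the "is it defined" test is a single pointer comparison.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define FRAME_CAP 16  /* arbitrary capacity for this toy frame */

/* Hypothetical stand-in for one symbol/value binding. */
typedef struct {
    const char *name;
    const void *value;
} binding_t;

/* Hypothetical stand-in for a single environment frame. */
typedef struct {
    binding_t slots[FRAME_CAP];
    size_t n;
} frame_t;

/* Unique sentinel playing the role of R_UnboundValue. Its address cannot
   collide with any stored value, including NULL. */
static const char toy_unbound_tag = 0;
#define TOY_UNBOUND ((const void *)&toy_unbound_tag)

/* Bind name to value in the frame, overwriting an existing binding. */
void frame_define(frame_t *f, const char *name, const void *value) {
    for (size_t i = 0; i < f->n; i++) {
        if (strcmp(f->slots[i].name, name) == 0) {
            f->slots[i].value = value;
            return;
        }
    }
    assert(f->n < FRAME_CAP);
    f->slots[f->n].name = name;
    f->slots[f->n].value = value;
    f->n++;
}

/* One lookup: returns the bound value, or TOY_UNBOUND if name is not bound. */
const void *toy_fetch(const frame_t *f, const char *name) {
    for (size_t i = 0; i < f->n; i++) {
        if (strcmp(f->slots[i].name, name) == 0)
            return f->slots[i].value;
    }
    return TOY_UNBOUND;
}

/* Cheap check: a single pointer comparison against the sentinel. */
int toy_defined(const void *x) {
    return x != TOY_UNBOUND;
}
```

A caller would write "const void *x = toy_fetch(&f, "x"); if (toy_defined(x))
{ ... }", which walks the frame once instead of twice. Using a dedicated
sentinel rather than NULL matters because NULL is itself a legitimate value
for a binding, just as it is in R.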

A few more thoughts about "exists":

Moving the bit of R in the exists function to C saves 10% of the time.
Dropping the redundant pos and frame args entirely saves 30% of the time
used by this function. I suggest that the arguments of both get and exists
should be simplified to (x, envir, mode, inherits). The existing C code handles
numeric, character, and environment input for where. The arg frame is
rarely used (0 of the 128 exists calls in the methods package). Users who need
it can call sys.frame themselves. get already lacks a frame argument and the
manpage for exists notes that envir is only there for backwards
compatibility. Let's deprecate the extra args in exists and get and perhaps
move the extra argument handling to C in the interim.  Similarly, the
"assign" function does nothing with the "immediate" argument.

I'd be interested to hear if there is any support for a "fetch"-like
function (and/or deprecating some unused arguments).

All the best,
Pete

____________________
Peter M. Haverty, Ph.D.
Genentech, Inc.
phaverty at gene.com
