[Rd] Mitigating Stalls Caused by Call Deparse on Error

brodie gaslam brod|e@g@@|@m @end|ng |rom y@hoo@com
Sat Jul 13 19:31:44 CEST 2019


When large calls cause errors R may stall for extended periods.  This
is particularly likely to happen with `do.call`, as in this example 
with a 24 second stall:

    x <- runif(1e7)
    system.time(do.call(paste0, list(abs, x)))  # intentional error
    ## Error in (function (..., collapse = NULL)  : 
    ##   cannot coerce type 'builtin' to vector of type 'character'
    ## Calls: system.time -> do.call -> <Anonymous>
    ## Timing stopped at: 23.81 0.149 24.04

    str(.Traceback)
    ## Dotted pair list of 3
    ##  $ : chr [1:2500488] "(function (..., collapse = NULL) " ".Internal(paste0(list(...), collapse)))(.Primitive(\"abs\"), c(0.718117154669017, " "0.494785501621664, 0.1453434410505, 0.635028422810137, 0.0353180423844606, " "0.688418723642826, 0.889682895969599, 0.728154224809259, 0.292572240810841, " ...
    ##  $ : chr "do.call(paste0, list(abs, x))"
    ##  $ : chr "system.time(do.call(paste0, list(abs, x)))"

The first time I noticed this I thought my session had frozen/crashed
as the standard interrupt ^C does not work during the deparse.  The
stall happens when on error the call stack is deparsed prior to being
saved to `.Traceback`.  The deparsing is done by `deparse1m` in native
code, with the value of `getOption('deparse.max.lines')` which
defaults to all lines.

Since there is little value to seeing millions of lines of deparsed
objects in `traceback()`, a simple work-around is to change the
`deparse.max.lines` value:

    options(deparse.max.lines=1)
    system.time(do.call(paste0, list(abs, x)))
    ## Error in (function (..., collapse = NULL)  : 
    ##   cannot coerce type 'builtin' to vector of type 'character'
    ## Calls: system.time -> do.call -> <Anonymous>
    ## Timing stopped at: 0 0 0

Unfortunately this will affect all `deparse` calls, and it seems
undesirable to pre-emptively enable it just for calls that might cause
large deparses on error.

An alternative is to store the actual calls instead of their deparsed
character equivalents in `.Traceback`.  This defers the deparsing to 
when `traceback()` is used.  As per `?traceback`, it should be
relatively safe to modify `.Traceback` in this way:

> It is undocumented where .Traceback is stored nor that it is
> visible, and this is subject to change.

Deferring the deparsing to `traceback()` will give us the 
opportunity to use a different `max.lines` setting as we do here 
with the patch applied:

    system.time(do.call(paste0, list(abs, x)))
    ## Error in (function (..., collapse = NULL)  : 
    ##   cannot coerce type 'builtin' to vector of type 'character'
    ## Timing stopped at: 0.028 0 0.029

    system.time(traceback(max.lines=3))
    ## 3: (function (..., collapse = NULL) 
    ##    .Internal(paste0(list(...), collapse)))(.Primitive("abs"), c(0.535468587651849, 
    ##    0.0540027911774814, 0.732930393889546, 0.565360915614292, 0.544816034380347, 
    ##     ...
    ## 2: do.call(paste0, list(abs, x))
    ## 1: system.time(do.call(paste0, list(abs, x)))
    ##    user  system elapsed 
    ##   0.000   0.000   0.003 


More generally, it might be better to have a different smaller default
value for the lines to deparse when calls  are _displayed_ as parts of 
lists, as is the case with `traceback()`, or in `print(sys.calls())` and
similar.

I attach a patch that does this.  I have run some basic tests 
and `make check-devel` passes. I can file an issue on bugzilla 
if that is a better place to have this conversation (assuming there 
is interest in it).

Best,

Brodie

PS: for some reason my mail client is refusing to attach the patch so I paste it
starting on the next line.
Index: src/gnuwin32/Rdll.hide
===================================================================
--- src/gnuwin32/Rdll.hide    (revision 76827)
+++ src/gnuwin32/Rdll.hide    (working copy)
@@ -94,6 +94,7 @@
  R_GetMaxNSize
  R_GetMaxVSize
  R_GetTraceback
+ R_GetTracebackParsed
  R_GetVarLocSymbol
  R_GetVarLocValue
  R_HandlerStack
Index: src/include/Defn.h
===================================================================
--- src/include/Defn.h    (revision 76827)
+++ src/include/Defn.h    (working copy)
@@ -1296,6 +1296,7 @@
 void NORET ErrorMessage(SEXP, int, ...);
 void WarningMessage(SEXP, R_WARNING, ...);
 SEXP R_GetTraceback(int);
+SEXP R_GetTracebackParsed(int);
 
 R_size_t R_GetMaxVSize(void);
 void R_SetMaxVSize(R_size_t);
Index: src/library/base/R/traceback.R
===================================================================
--- src/library/base/R/traceback.R    (revision 76827)
+++ src/library/base/R/traceback.R    (working copy)
@@ -16,9 +16,19 @@
 #  A copy of the GNU General Public License is available at
 #  https://www.R-project.org/Licenses/
 
-.traceback <- function(x = NULL) {
-    if(is.null(x) && !is.null(x <- get0(".Traceback", envir = baseenv())))
-    {}
+.traceback <- function(x = NULL, max.lines=getOption("deparse.max.lines")) {
+    if(!(is.numeric(max.lines) && !is.na(max.lines) &&
+         as.integer(max.lines) > 0L)
+    ) {
+        max.lines <- -1L
+    }
+    if(is.null(x) && !is.null(x <- get0(".Traceback", envir = baseenv()))) {
+        for(i in seq_along(x)) {
+            srcref <- attr(x[[i]], 'srcref')
+            x[[i]] <- deparse(x[[i]], nlines=max.lines)
+            attr(x[[i]], 'srcref') <- srcref
+        }
+    }
     else if (is.numeric(x))
         x <- .Internal(traceback(x))
     x
@@ -26,7 +36,16 @@
 
 traceback <- function(x = NULL, max.lines = getOption("deparse.max.lines"))
 {
-    n <- length(x <- .traceback(x))
+    valid.max.lines <- is.numeric(max.lines) && !is.na(max.lines) &&
+         as.integer(max.lines) > 0L
+
+    if(valid.max.lines) {
+        # max.lines + 1L so we can know that output was truncated by .traceback
+        max.lines.2 <- as.integer(max.lines) + 1L
+    } else {
+        max.lines.2 <- -1L
+    }
+    n <- length(x <- .traceback(x, max.lines=max.lines.2))
     if(n == 0L)
         cat(gettext("No traceback available"), "\n")
     else {
@@ -40,7 +59,7 @@
                 paste0(" at ", basename(srcfile$filename), "#", srcref[1L])
             }
             ## Truncate deparsed code (destroys attributes of xi)
-            if(is.numeric(max.lines) && max.lines > 0L && max.lines < m) {
+            if(valid.max.lines &&  max.lines < m) {
                 xi <- c(xi[seq_len(max.lines)], " ...")
                 m <- length(xi)
             }
Index: src/library/base/man/traceback.Rd
===================================================================
--- src/library/base/man/traceback.Rd    (revision 76827)
+++ src/library/base/man/traceback.Rd    (working copy)
@@ -21,7 +21,7 @@
 }
 \usage{
 traceback(x = NULL, max.lines = getOption("deparse.max.lines"))
-.traceback(x = NULL)
+.traceback(x = NULL, max.lines = getOption("deparse.max.lines"))
 }
 \arguments{
   \item{x}{\code{NULL} (default, meaning \code{.Traceback}), or an
Index: src/main/errors.c
===================================================================
--- src/main/errors.c    (revision 76827)
+++ src/main/errors.c    (working copy)
@@ -1008,7 +1008,7 @@
        (which should not happen) */
     if (traceback && inError < 2 && inError == oldInError) {
         inError = 2;
-        PROTECT(s = R_GetTraceback(0));
+        PROTECT(s = R_GetTracebackParsed(0));
         SET_SYMVALUE(install(".Traceback"), s);
         /* should have been defineVar
            setVar(install(".Traceback"), s, R_GlobalEnv); */
@@ -1440,9 +1440,11 @@
     PrintWarnings();
     }
 }
-
+/*
+ * Return the traceback without deparsing the calls
+ */
 attribute_hidden
-SEXP R_GetTraceback(int skip)
+SEXP R_GetTracebackParsed(int skip)
 {
     int nback = 0, ns;
     RCNTXT *c;
@@ -1467,7 +1469,9 @@
         if (skip > 0)
         skip--;
         else {
-        SETCAR(t, deparse1m(c->call, 0, DEFAULTDEPARSE));
+                // Extra paranoid PROTECTS
+        SETCAR(t, PROTECT(duplicate(c->call)));
+                UNPROTECT(1);
         if (c->srcref && !isNull(c->srcref)) {
             SEXP sref;
             if (c->srcref == R_InBCInterpreter)
@@ -1482,7 +1486,26 @@
     UNPROTECT(1);
     return s;
 }
+/*
+ * Return the traceback with calls deparsed
+ */
+attribute_hidden
+SEXP R_GetTraceback(int skip)
+{
+    int nback = 0;
+    SEXP s, t, u, v;
+    s = PROTECT(R_GetTracebackParsed(skip));
+    for(t = s; t != R_NilValue; t = CDR(t)) nback++;
+    u = v = PROTECT(allocList(nback));
 
+    for(t = s; t != R_NilValue; t = CDR(t), v=CDR(v)) {
+        SETCAR(v, PROTECT(deparse1m(CAR(t), 0, DEFAULTDEPARSE)));
+        UNPROTECT(1);
+    }
+    UNPROTECT(1);
+    return u;
+}
+
 SEXP attribute_hidden do_traceback(SEXP call, SEXP op, SEXP args, SEXP rho)
 {
     int skip;



More information about the R-devel mailing list