[Rd] [External] Mitigating Stalls Caused by Call Deparse on Error

Tierney, Luke |uke-t|erney @end|ng |rom u|ow@@edu
Mon Jul 15 16:04:16 CEST 2019


Better to add this to the wishlist item. This all needs to be looked
at together, and nothing is likely to happen until after
vacation/conference season.  It will disappear from everyone's radar
if it is just in R_devel.

Best,

luke

On Sun, 14 Jul 2019, brodie gaslam wrote:

> Luke, thanks for considering the issue.  I would like to
> try to separate the problem into two parts, as I _think_
> your comments address primarily part 2 below:
>
> 1. How can we avoid significant and possibly crippling
>    stalls on error with these non-standard calls.
> 2. What is the best way to view these non-standard calls.
>
> I agree that issue 2. requires further thought and
> discussion under a wishlist issue ([on bugzilla now][1]). 
> While I did raise issue 2., the patch itself makes no
> attempt to resolve it.
>
> The proposed patch resolves issue 1., which is a big
> usability problem.  Right now if you have the misfortune of
> using `do.call` with a big object and trigger an error, you
> have the choice of waiting a possibly long time for
> the deparse to complete, or killing your entire R session
> externally.
>
> It seems a shame to allow a big usability issue for `do.call`
> to remain when there is a simple solution at hand, especially
> since the complete deparse of large objects likely serves no
> purpose in this case. Obviously, if storing the actual calls
> instead of their deparsed equivalents in .Traceback causes
> problems I'm not anticipating, then that's different. 
> Is that the case?
>
> Best,
>
> Brodie.
>
> [1]: https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17580
>
> On Sunday, July 14, 2019, 8:52:45 AM EDT, Tierney, Luke <luke-tierney using uiowa.edu> wrote:
>
>
>
>
>
> This is probably best viewed in the context of other issue with
> displaying calls, such as issues arising from calls constructed in
> non-standard evaluation contexts. Might be good to move to a wishlist
> item in bugzilla.
>
> Best,
>
> luke
>
> On Sat, 13 Jul 2019, brodie gaslam via R-devel wrote:
>
>> When large calls cause errors R may stall for extended periods.  This
>> is particularly likely to happen with `do.call`, as in this example
>> with a 24 second stall:
>>
>>     x <- runif(1e7)
>>     system.time(do.call(paste0, list(abs, x)))  # intentional error
>>     ## Error in (function (..., collapse = NULL)  :
>>     ##   cannot coerce type 'builtin' to vector of type 'character'
>>     ## Calls: system.time -> do.call -> <Anonymous>
>>     ## Timing stopped at: 23.81 0.149 24.04
>>
>>     str(.Traceback)
>>     ## Dotted pair list of 3
>>     ##  $ : chr [1:2500488] "(function (..., collapse = NULL) " ".Internal(paste0(list(...), collapse)))(.Primitive(\"abs\"), c(0.718117154669017, " "0.494785501621664, 0.1453434410505, 0.635028422810137, 0.0353180423844606, " "0.688418723642826, 0.889682895969599, 0.728154224809259, 0.292572240810841, " ...
>>     ##  $ : chr "do.call(paste0, list(abs, x))"
>>     ##  $ : chr "system.time(do.call(paste0, list(abs, x)))"
>>
>> The first time I noticed this I thought my session had frozen/crashed
>> as the standard interrupt ^C does not work during the deparse.  The
>> stall happens when on error the call stack is deparsed prior to being
>> saved to `.Traceback`.  The deparsing is done by `deparse1m` in native
>> code, with the value of `getOption('deparse.max.lines')` which
>> defaults to all lines.
>>
>> Since there is little value to seeing millions of lines of deparsed
>> objects in `traceback()`, a simple work-around is to change the
>> `deparse.max.lines` value:
>>
>>     options(deparse.max.lines=1)
>>     system.time(do.call(paste0, list(abs, x)))
>>     ## Error in (function (..., collapse = NULL)  :
>>     ##   cannot coerce type 'builtin' to vector of type 'character'
>>     ## Calls: system.time -> do.call -> <Anonymous>
>>     ## Timing stopped at: 0 0 0
>>
>> Unfortunately this will affect all `deparse` calls, and it seems
>> undesirable to pre-emptively enable it just for calls that might cause
>> large deparses on error.
>>
>> An alternative is to store the actual calls instead of their deparsed
>> character equivalents in `.Traceback`.  This defers the deparsing to
>> when `traceback()` is used.  As per `?traceback`, it should be
>> relatively safe to modify `.Traceback` in this way:
>>
>>> It is undocumented where .Traceback is stored nor that it is
>>> visible, and this is subject to change.
>>
>> Deferring the deparsing to `traceback()` will give us the 
>> opportunity to use a different `max.lines` setting as we do here
>> with the patch applied:
>>
>>     system.time(do.call(paste0, list(abs, x)))
>>     ## Error in (function (..., collapse = NULL)  :
>>     ##   cannot coerce type 'builtin' to vector of type 'character'
>>     ## Timing stopped at: 0.028 0 0.029
>>
>>     system.time(traceback(max.lines=3))
>>     ## 3: (function (..., collapse = NULL)
>>     ##    .Internal(paste0(list(...), collapse)))(.Primitive("abs"), c(0.535468587651849,
>>     ##    0.0540027911774814, 0.732930393889546, 0.565360915614292, 0.544816034380347,
>>     ##     ...
>>     ## 2: do.call(paste0, list(abs, x))
>>     ## 1: system.time(do.call(paste0, list(abs, x)))
>>     ##    user  system elapsed
>>     ##   0.000   0.000   0.003
>>
>>
>> More generally, it might be better to have a different smaller default
>> value for the lines to deparse when calls  are _displayed_ as parts of
>> lists, as is the case with `traceback()`, or in `print(sys.calls())` and
>> similar.
>>
>> I attach a patch that does this.  I have run some basic tests
>> and `make check-devel` passes. I can file an issue on bugzilla
>> if that is a better place to have this conversation (assuming there
>> is interest in it).
>>
>> Best,
>>
>> Brodie
>>
>> PS: for some reason my mail client is refusing to attach the patch so I paste it
>> starting on the next line.
>> Index: src/gnuwin32/Rdll.hide
>> ===================================================================
>> --- src/gnuwin32/Rdll.hide    (revision 76827)
>> +++ src/gnuwin32/Rdll.hide    (working copy)
>> @@ -94,6 +94,7 @@
>>   R_GetMaxNSize
>>   R_GetMaxVSize
>>   R_GetTraceback
>> + R_GetTracebackParsed
>>   R_GetVarLocSymbol
>>   R_GetVarLocValue
>>   R_HandlerStack
>> Index: src/include/Defn.h
>> ===================================================================
>> --- src/include/Defn.h    (revision 76827)
>> +++ src/include/Defn.h    (working copy)
>> @@ -1296,6 +1296,7 @@
>>  void NORET ErrorMessage(SEXP, int, ...);
>>  void WarningMessage(SEXP, R_WARNING, ...);
>>  SEXP R_GetTraceback(int);
>> +SEXP R_GetTracebackParsed(int);
>>  
>>  R_size_t R_GetMaxVSize(void);
>>  void R_SetMaxVSize(R_size_t);
>> Index: src/library/base/R/traceback.R
>> ===================================================================
>> --- src/library/base/R/traceback.R    (revision 76827)
>> +++ src/library/base/R/traceback.R    (working copy)
>> @@ -16,9 +16,19 @@
>>  #  A copy of the GNU General Public License is available at
>>  #  https://www.R-project.org/Licenses/
>>  
>> -.traceback <- function(x = NULL) {
>> -    if(is.null(x) && !is.null(x <- get0(".Traceback", envir = baseenv())))
>> -    {}
>> +.traceback <- function(x = NULL, max.lines=getOption("deparse.max.lines")) {
>> +    if(!(is.numeric(max.lines) && !is.na(max.lines) &&
>> +         as.integer(max.lines) > 0L)
>> +    ) {
>> +        max.lines <- -1L
>> +    }
>> +    if(is.null(x) && !is.null(x <- get0(".Traceback", envir = baseenv()))) {
>> +        for(i in seq_along(x)) {
>> +            srcref <- attr(x[[i]], 'srcref')
>> +            x[[i]] <- deparse(x[[i]], nlines=max.lines)
>> +            attr(x[[i]], 'srcref') <- srcref
>> +        }
>> +    }
>>      else if (is.numeric(x))
>>          x <- .Internal(traceback(x))
>>      x
>> @@ -26,7 +36,16 @@
>>  
>>  traceback <- function(x = NULL, max.lines = getOption("deparse.max.lines"))
>>  {
>> -    n <- length(x <- .traceback(x))
>> +    valid.max.lines <- is.numeric(max.lines) && !is.na(max.lines) &&
>> +         as.integer(max.lines) > 0L
>> +
>> +    if(valid.max.lines) {
>> +        # max.lines + 1L so we can know that output was truncated by .traceback
>> +        max.lines.2 <- as.integer(max.lines) + 1L
>> +    } else {
>> +        max.lines.2 <- -1L
>> +    }
>> +    n <- length(x <- .traceback(x, max.lines=max.lines.2))
>>      if(n == 0L)
>>          cat(gettext("No traceback available"), "\n")
>>      else {
>> @@ -40,7 +59,7 @@
>>                  paste0(" at ", basename(srcfile$filename), "#", srcref[1L])
>>              }
>>              ## Truncate deparsed code (destroys attributes of xi)
>> -            if(is.numeric(max.lines) && max.lines > 0L && max.lines < m) {
>> +            if(valid.max.lines &&  max.lines < m) {
>>                  xi <- c(xi[seq_len(max.lines)], " ...")
>>                  m <- length(xi)
>>              }
>> Index: src/library/base/man/traceback.Rd
>> ===================================================================
>> --- src/library/base/man/traceback.Rd    (revision 76827)
>> +++ src/library/base/man/traceback.Rd    (working copy)
>> @@ -21,7 +21,7 @@
>>  }
>>  \usage{
>>  traceback(x = NULL, max.lines = getOption("deparse.max.lines"))
>> -.traceback(x = NULL)
>> +.traceback(x = NULL, max.lines = getOption("deparse.max.lines"))
>>  }
>>  \arguments{
>>    \item{x}{\code{NULL} (default, meaning \code{.Traceback}), or an
>> Index: src/main/errors.c
>> ===================================================================
>> --- src/main/errors.c    (revision 76827)
>> +++ src/main/errors.c    (working copy)
>> @@ -1008,7 +1008,7 @@
>>         (which should not happen) */
>>      if (traceback && inError < 2 && inError == oldInError) {
>>          inError = 2;
>> -        PROTECT(s = R_GetTraceback(0));
>> +        PROTECT(s = R_GetTracebackParsed(0));
>>          SET_SYMVALUE(install(".Traceback"), s);
>>          /* should have been defineVar
>>             setVar(install(".Traceback"), s, R_GlobalEnv); */
>> @@ -1440,9 +1440,11 @@
>>      PrintWarnings();
>>      }
>>  }
>> -
>> +/*
>> + * Return the traceback without deparsing the calls
>> + */
>>  attribute_hidden
>> -SEXP R_GetTraceback(int skip)
>> +SEXP R_GetTracebackParsed(int skip)
>>  {
>>      int nback = 0, ns;
>>      RCNTXT *c;
>> @@ -1467,7 +1469,9 @@
>>          if (skip > 0)
>>          skip--;
>>          else {
>> -        SETCAR(t, deparse1m(c->call, 0, DEFAULTDEPARSE));
>> +                // Extra paranoid PROTECTS
>> +        SETCAR(t, PROTECT(duplicate(c->call)));
>> +                UNPROTECT(1);
>>          if (c->srcref && !isNull(c->srcref)) {
>>              SEXP sref;
>>              if (c->srcref == R_InBCInterpreter)
>> @@ -1482,7 +1486,26 @@
>>      UNPROTECT(1);
>>      return s;
>>  }
>> +/*
>> + * Return the traceback with calls deparsed
>> + */
>> +attribute_hidden
>> +SEXP R_GetTraceback(int skip)
>> +{
>> +    int nback = 0;
>> +    SEXP s, t, u, v;
>> +    s = PROTECT(R_GetTracebackParsed(skip));
>> +    for(t = s; t != R_NilValue; t = CDR(t)) nback++;
>> +    u = v = PROTECT(allocList(nback));
>>  
>> +    for(t = s; t != R_NilValue; t = CDR(t), v=CDR(v)) {
>> +        SETCAR(v, PROTECT(deparse1m(CAR(t), 0, DEFAULTDEPARSE)));
>> +        UNPROTECT(1);
>> +    }
>> +    UNPROTECT(1);
>> +    return u;
>> +}
>> +
>>  SEXP attribute_hidden do_traceback(SEXP call, SEXP op, SEXP args, SEXP rho)
>>  {
>>      int skip;
>
>>
>> ______________________________________________
>> R-devel using r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
>
>

-- 
Luke Tierney
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa                  Phone:             319-335-3386
Department of Statistics and        Fax:               319-335-3017
    Actuarial Science
241 Schaeffer Hall                  email:   luke-tierney using uiowa.edu
Iowa City, IA 52242                 WWW:  http://www.stat.uiowa.edu


More information about the R-devel mailing list