[R] Parallel assignments and goto

Sun Feb 11 16:48:40 CET 2018

Hi guys,

I am working on some code for automatically translating recursive functions into looping functions to implemented tail-recursion optimisations. See https://github.com/mailund/tailr

As a toy-example, consider the factorial function

factorial <- function(n, acc = 1) {
    if (n <= 1) acc
    else factorial(n - 1, acc * n)
}

I can automatically translate this into the loop-version

factorial_tr_1 <- function (n, acc = 1) 
{
    repeat {
        if (n <= 1) 
            return(acc)
        else {
            .tailr_n <- n - 1
            .tailr_acc <- acc * acc
            n <- .tailr_n
            acc <- .tailr_acc
            next
        }
    }
}

which will run faster and not have problems with recursion depths. However, I’m not entirely happy with this version for two reasons: I am not happy with introducing the temporary variables and this rewrite will not work if I try to over-scope an evaluation context.

I have two related questions, one related to parallel assignments — i.e. expressions to variables so the expression uses the old variable values and not the new values until the assignments are all done — and one related to restarting a loop from nested loops or from nested expressions in `with` expressions or similar.

I can implement parallel assignment using something like rlang::env_bind:

factorial_tr_2 <- function (n, acc = 1) 
{
    .tailr_env <- rlang::get_env()
    repeat {
        if (n <= 1) 
            return(acc)
        else {
            rlang::env_bind(.tailr_env, n = n - 1, acc = acc * n)
            next
        }
    }
}

This reduces the number of additional variables I need to one, but is a couple of orders of magnitude slower than the first version.

> microbenchmark::microbenchmark(factorial(100),
+                                factorial_tr_1(100),
+                                factorial_tr_2(100))
Unit: microseconds
                     expr      min       lq       mean    median       uq      max neval
      factorial(100)   53.978   60.543   77.76203   71.0635   85.947  180.251   100
 factorial_tr_1(100)    9.022    9.903   11.52563   11.0430   11.984   28.464   100
 factorial_tr_2(100) 5870.565 6109.905 6534.13607 6320.4830 6756.463 8177.635   100

Is there another way to do parallel assignments that doesn’t cost this much in running time?

My other problem is the use of `next`. I would like to combine tail-recursion optimisation with pattern matching as in https://github.com/mailund/pmatch where I can, for example, define a linked list like this:

devtools::install_github("mailund/pmatch”)
library(pmatch)
llist := NIL | CONS(car, cdr : llist)

and define a function for computing the length of a list like this:

list_length <- function(lst, acc = 0) {
  force(acc)
  cases(lst,
        NIL -> acc,
        CONS(car, cdr) -> list_length(cdr, acc + 1))
}

The `cases` function creates an environment that binds variables in a pattern-description that over-scopes the expression to the right of `->`, so the recursive call in this example have access to the variables `cdr` and `car`.

I can transform a `cases` call to one that creates the environment containing the bound variables and then evaluate this using `eval` or `with`, but in either case, a call to `next` will not work in such a context. The expression will be evaluated inside `bind` or `with`, and not in the `list_lenght` function.

A version that *will* work, is something like this

factorial_tr_3 <- function (n, acc = 1) 
{
    .tailr_env <- rlang::get_env()
    .tailr_frame <- rlang::current_frame()
    repeat {
        if (n <= 1) 
            rlang::return_from(.tailr_frame, acc)
        else {
            rlang::env_bind(.tailr_env, n = n - 1, acc = acc * n)
            rlang::return_to(.tailr_frame)
        }
    }
}

Here, again, for the factorial function since this is easier to follow than the list-length function.

This solution will also work if you return values from inside loops, where `next` wouldn’t work either.

Using `rlang::return_from` and `rlang::return_to` implements the right semantics, but costs me another order of magnitude in running time.

microbenchmark::microbenchmark(factorial(100),
                               factorial_tr_1(100),
                               factorial_tr_2(100),
                               factorial_tr_3(100))
Unit: microseconds
                expr       min         lq        mean     median        uq        max neval
      factorial(100)    52.479    60.2640    93.43069    67.5130    83.925   2062.481   100
 factorial_tr_1(100)     8.875     9.6525    49.19595    10.6945    11.217   3818.823   100
 factorial_tr_2(100)  5296.350  5525.0745  5973.77664  5737.8730  6260.128   8471.301   100
 factorial_tr_3(100) 77554.457 80757.0905 87307.28737 84004.0725 89859.169 171039.228   100

I can live with the “introducing extra variables” solution to parallel assignment, and I could hack my way out of using `with` or `bind` in rewriting `cases`, but restarting a `repeat` loop would really make for a nicer solution. I know that `goto` is considered harmful, but really, in this case, it is what I want.

A `callCC` version also solves the problem

factorial_tr_4 <- function(n, acc = 1) {
    function_body <- function(continuation) {
        if (n <= 1) {
            continuation(acc)
        } else {
            continuation(list("continue", n = n - 1, acc = acc * n))
        }
    }
    repeat {
        result <- callCC(function_body)
        if (is.list(result) && result[[1]] == "continue") {
            n <- result$n
            acc <- result$acc
            next
        } else {
            return(result)
        }
    }
}

But this requires that I know how to distinguish between a valid return value and a tag for “next” and is still a lot slower than the `next` solution

microbenchmark::microbenchmark(factorial(100),
                               factorial_tr_1(100),
                               factorial_tr_2(100),
                               factorial_tr_3(100),
                               factorial_tr_4(100))
Unit: microseconds
                expr       min         lq        mean     median        uq        max neval
      factorial(100)    54.109    61.8095    81.33167    81.8785    89.748    243.554   100
 factorial_tr_1(100)     9.025     9.9035    11.38607    11.1990    12.008     22.375   100
 factorial_tr_2(100)  5272.524  5798.3965  6302.40467  6077.7180  6492.959   9967.237   100
 factorial_tr_3(100) 66186.080 72336.2810 76480.75172 73632.9665 75405.054 203785.673   100
 factorial_tr_4(100)   270.978   302.7890   337.48763   313.9930   334.096   1425.702   100

I don’t necessarily need the tail-recursion optimisation to be faster than the recursive version; just getting out of the problem of too deep recursions is a benefit, but I would rather not pay with an order of magnitude for it. I could, of course, try to handle cases that works with `next` in one way, and other cases using `callCC`, but I feel it should be possible with a version that handles all cases the same way.

Is there any way to achieve this?

Cheers
	Thomas