[Rd] Very hard to reproduce bug (?) in R-devel

Mon Apr 3 18:36:21 CEST 2017

Small correction to my previous email: the error can also be
replicated by installing the package and then sourcing the tests. This
is slightly faster than running R CMD check. It is also how we ran the
tests with gdb. This will run the tests in a Docker container and
raise the same error. (To run outside of a Docker container, simply
leave out the first line).

  docker run --rm -ti pooltest /bin/bash
  RD CMD INSTALL pool_0.1.0.tar.gz  --install-tests
  cd /usr/local/lib/R/site-library/pool/tests
  RD -e 'source("testthat.R")'

At the suggestion of Tomas Kalibera, I ran the tests with
gctorture(TRUE). When I do this, I get three errors instead of just
one:

======================
# RD -e 'gctorture(TRUE); source("testthat.R")'

R Under development (unstable) (2017-03-23 r72389) -- "Unsuffered Consequences"
Copyright (C) 2017 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> gctorture(TRUE); source("testthat.R")
1. Failure: pool: checks for leaks (named) (@test-leak-detection.R#21) ---------
gc() showed 0 warnings

2. Failure: pool scheduler: schedules things in the right order
(@test-scheduling.R#24)
`results` not identical to 1:3.
2/3 mismatches (average diff: 1)
[1] 2 - 1 ==  1
[2] 1 - 2 == -1

3. Failure: pool scheduler: works with pool (@test-scheduling.R#31) ------------
pool$counters$free not identical to 1.
1/1 mismatches
[1] 0 - 1 == -1

testthat results
================================================================
OK: 199 SKIPPED: 0 FAILED: 3
1. Failure: pool: checks for leaks (named) (@test-leak-detection.R#21)
2. Failure: pool scheduler: schedules things in the right order
(@test-scheduling.R#24)
3. Failure: pool scheduler: works with pool (@test-scheduling.R#31)

Error: testthat unit tests failed
Execution halted

======================

On Mon, Apr 3, 2017 at 10:34 AM, Winston Chang <winstonchang1 at gmail.com> wrote:
> When running R CMD check on a package, we are encountering an error on
> R-devel (as of 72457) on Linux. Unfortunately, it is very hard to
> reproduce, and almost any change to the code makes the error go away.
> I believe that this is due to a bug in R-devel, which has been present
> since at least commit 72128 (on 2017-02-06).
>
> The test error occurs when R CMD check is run on Travis CI (on Ubuntu
> 12.04.05), and locally on R-devel (on Debian testing). If we make any
> of the following changes, the error does not occur, and the test
> passes:
> - Run the test scripts from an interactive R session
> - Run on a Mac instead of Linux
> - Use R 3.3.3 or 3.2.5
> - Add code right before the tests. For example, just running gc() or
> format(1) right before the test causes it to not fail. However, adding
> very simple code like 1+1 does not affect the test (it still fails).
> - Run R CMD check with -d gdb or -d valgrind
>
> Because the test is so sensitive to any changes, we've been unable to
> track down the source of the problem. We can't be sure, but the error
> message suggests that a promise is getting evaluated with the wrong
> environment.
>
> Here's the relevant output from the tests:
> 1. Error: pool scheduler: schedules things in the right order
> (@test-scheduling.R#13)
> could not find function "task"
> 1: naiveScheduler$protect({
>        scheduleTask(1000, function() {
>            results <<- c(results, 3L)
>        })
>        scheduleTask(100, function() {
>            results <<- c(results, 2L)
>        })
>        scheduleTask(10, function() {
>            results <<- c(results, 1L)
>        })
>    }) at testthat/test-scheduling.R:13
> 2: private$refCount$release() at testthat/test-scheduling.R:13
> 3: private$callback()
> 4: self$protect({
>        task()
>    })
> 5: private$refCount$release()
> 6: private$callback()
> 7: self$protect({
>        task()
>    })
> 8: force(expr)
>
> You can see complete logs on Travis here:
>   https://travis-ci.org/rstudio/pool/jobs/217301804#L747-L771
>
> In the test, a re-entrant function is called (self$protect, at levels
> 4 and 7 on the call stack), and in the inner call (7), a variable,
> `task`, does not have the correct value. It should be a function, but
> it isn't; it's either NULL or some other non-function value.
>
> That code is here:
>   https://github.com/rstudio/pool/blob/f7e52a3/R/scheduler.R#L79-L81
> Note that there's a `force(task)` a couple lines above, which is
> commented out. If that is uncommented, the test passes, even though
> calling force() there shouldn't make a difference. Also, calling
> `format(1)` at that location also causes the test to pass.
>
>
> I've written instructions to reproduce the problem in a Docker container here:
>   https://gist.github.com/wch/4cd0c56757886c63522049a32b3c294f
>
> If you want to try reproducing the error without using Docker, I
> believe if you download pool_0.1.0.tar.gz file from the
> gist.github.com URL above, and run R CMD check on it, that will do it.
> Some upstream dependencies will likely need to be installed; they are
> listed in the code in the gist.
>
>
> We also used git bisect to try to find the first commit in R where the
> test would fail. The instructions for doing this are also in the gist
> linked to above. It found that the first commit where the test failed
> was 72128:
>   https://github.com/wch/r-source/commit/7c950ac11202cabc25b276c2f4da82226644953d
>
> Taking a look at the changes in that commit, however, it seems very
> unlikely that it caused the error. Instead, it probably just made a
> change which allowed an already-existing bug to manifest.
>
>
> We'd appreciate any help finding the cause of this bug!
> -Winston