[Rd] Generate reproducible output independently of the build path

Ximin Luo infinity0 at pwned.gg
Wed May 3 15:10:00 CEST 2017


Ximin Luo:
> [..]
> 
> I've attached a patch (applies to both 3.3.3 and 3.4) that fixes this issue; however I know it's not perfect and would welcome feedback on how to make it acceptable to the R project.
> 

Hi all, attached is an updated version of the patch.

We've tested this on our Jenkins infrastructure, and it makes 463 of the 478 Debian R packages reproducible:

https://tests.reproducible-builds.org/debian/issues/unstable/randomness_in_r_rdb_rds_databases_issue.html

The previous version of the patch was slightly flawed: it made two of these packages (r-bioc-biobase, r-cran-shinybs) fail to build from source (FTBFS). This is fixed in the current patch, attached, and both packages are now reproducible with it.

The one remaining FTBFS (r-cran-randomfields) is due to incompatibilities between r-base 3.3.3 and 3.4.0, which are being discussed in Debian bug 861333; it is not caused by this patch.

The remaining 14 packages are probably unreproducible because of issues specific to those packages. For example, r-cran-runit-0.4.31/man/checkFuncs.Rd contains an explicit absolute path, and making it relative fixes the unreproducibility. I have not yet checked the other packages.
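
As a rough illustration of how such per-package issues might be spotted (a sketch in Python, not part of the attached patch; the function name `find_embedded_paths` and the example build path are hypothetical), one could scan an unpacked source or build tree for files that contain the build directory as plain bytes. This would not see inside compressed files, so it is only a first-pass check:

```python
import os


def find_embedded_paths(root, build_path):
    """Return sorted paths (relative to `root`) of files whose bytes contain `build_path`."""
    needle = os.fsencode(build_path)
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            # Read as bytes so binary files do not raise decoding errors.
            with open(full, "rb") as fh:
                if needle in fh.read():
                    hits.append(os.path.relpath(full, root))
    return sorted(hits)
```

For example, `find_embedded_paths("r-cran-runit-0.4.31", "/build/1st")` would list every file in that tree that mentions the (assumed) build directory `/build/1st`.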

> For example, I've tried to limit the effects of the patch only to the RDB loading/saving code, but I'm not familiar with the codebase so it would be good if someone could verify that I did this correctly. Then, ideally we would also add some tests to ensure that unreproducibility does not creep back in "by accident". R code heavily relies on absolute paths, and I went down several dead ends chasing and editing variables containing absolute paths, before I finally managed to get this working patch, so I suspect that without specific reproducibility tests, this issue might recur in the future.
> 

I've been talking with Dirk Eddelbuettel off-thread and he suggested that the rest of the patch could also be guarded by something like getOption("useRelativePath", bool).

It would be good if other members of R Core could comment and give me some more guidance along these lines. :)
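
To make the idea of a reproducibility test from the quoted message concrete, here is a minimal sketch (in Python, not part of the attached patch or of R's test suite; `tree_digests` and `differing_files` are hypothetical names). It assumes the same package has already been built twice under two different build paths, and it reports which output files differ between the two results:

```python
import hashlib
import os


def tree_digests(root):
    """Map each file's path relative to `root` to the SHA-256 of its contents."""
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            with open(full, "rb") as fh:
                digest = hashlib.sha256(fh.read()).hexdigest()
            digests[os.path.relpath(full, root)] = digest
    return digests


def differing_files(build_a, build_b):
    """Return sorted relative paths that are missing from one tree or differ in content."""
    a, b = tree_digests(build_a), tree_digests(build_b)
    return sorted(p for p in a.keys() | b.keys() if a.get(p) != b.get(p))
```

A test along these lines would pass when `differing_files(...)` returns an empty list for the two build outputs.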

> I've checked that the existing tests still pass, with this patch applied to the Debian package. I have some errors like:
> [..]
> :* checking whether the package can be loaded ... ERROR
> [..]

We also figured out that this was a pre-existing issue with Debian's R 3.3.3; the error goes away with 3.4.0, whether patched or unpatched.

X

-- 
GPG: ed25519/56034877E1F87C35
GPG: rsa4096/1318EFAC5FBBDBCE
https://github.com/infinity0/pubkeys.git
-------------- next part --------------
A non-text attachment was scrubbed...
Name: r-base_reproducible-build-paths.patch
Type: text/x-diff
Size: 2487 bytes
Desc: not available
URL: <https://stat.ethz.ch/pipermail/r-devel/attachments/20170503/3e8b53c6/attachment.bin>

