[R] Reproducibility Between Local and Remote Computer with R

Duncan Murdoch
Sun Aug 9 01:13:19 CEST 2020


On 08/08/2020 9:34 a.m., Marc Schwartz via R-help wrote:
> Hi,
> 
> I initially suspected that the change in the RNG might be the source; however, that change was made in 3.6.0 and would have applied to runif() and sample():
> 
> "sample.kind can be "Rounding" or "Rejection", or partial matches to these. The former was the default in versions prior to 3.6.0: it made sample noticeably non-uniform on large populations, and should only be used for reproduction of old results. See PR#17494 for a discussion."
> 

That still may be an issue.  If a user saves a workspace in an old 
version and reloads it in a newer version, I believe they get the old 
version of the RNG.

You need to check that the output of RNGkind() matches on all machines 
to know that they're using the same RNGs.
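
As a minimal sketch of that check: run the snippet below on each machine and compare what it prints. The explicit RNGkind() call is my assumption about a sensible target (the defaults documented for R 3.6.0 and later), not something established in this thread.

```r
# Run this on every machine and compare the printed values.
rng_here <- RNGkind()
print(R.version.string)
print(rng_here)

# Assumed target configuration: the defaults of R >= 3.6.0.
# Setting all three kinds explicitly overrides whatever a reloaded
# workspace may have restored.
RNGkind(kind = "Mersenne-Twister",
        normal.kind = "Inversion",
        sample.kind = "Rejection")
set.seed(123)

rng_fixed <- RNGkind()
print(rng_fixed)
```

If the first print differs between machines, setting the kinds explicitly before set.seed() should at least rule out the RNG configuration as the cause.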

Duncan Murdoch

> Three other possibilities:
> 
> 1. Read news() for your local 4.0.2 installation; some of the changes listed there, including changes to round(), could be applicable here.
> 
> 2. Check to see if the version of glmnet is the same on both machines. There have been changes to that package that might be relevant here and you might read the README and NEWS files for the package on CRAN to see if there is any relevant information there.
> 
> 3. There is always a chance that different hardware and OS versions could lead to differences, especially far enough out in the decimal places to alter results. If you, or an admin on your behalf, can update the remote machine (both R and installed packages), that would help to reduce the number of variables at play here.
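
One possible way to gather the details behind the three points above, so the two machines can be compared side by side. The helper name env_report is hypothetical, and the default package list (glmnet, lars) is only taken from the packages mentioned in this thread.

```r
# Hypothetical helper: collect the settings that most often differ
# between two R installations.
env_report <- function(pkgs = c("glmnet", "lars")) {
  versions <- vapply(pkgs, function(p) {
    # packageVersion() signals an error for a package that is not
    # installed, so report NA in that case.
    tryCatch(as.character(utils::packageVersion(p)),
             error = function(e) NA_character_)
  }, character(1))
  list(
    r_version = R.version.string,
    platform  = R.version$platform,
    rng       = RNGkind(),
    packages  = versions
  )
}

report <- env_report()
str(report)
```

sessionInfo() gives a fuller picture on each machine, including the operating system and attached packages; this sketch only narrows it down to the items discussed here.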
> 
> Regards,
> 
> Marc Schwartz
> 
> 
>> On Aug 7, 2020, at 4:24 PM, Kevin Egan <kevinegan31 using gmail.com> wrote:
>>
>> I posted this question:
>>
>> I am currently using R, RStudio, and a remote computer (using an R script) to run the same code. I start by using set.seed(123) in all three versions of the code, then use glmnet to assess a matrix. Ultimately, I am having trouble reproducing the results between my local computer and the remote computer. I am using R version 4.0.2 locally, and R version 3.6.0 on the remote machine.
>>
>> After running several tests, I'm wondering if there is a difference between the two versions of R which may lead to slightly different coefficients. If anyone has any insight I would appreciate it.
>>
>> Thanks.
>>
>> and found that there were slight differences in the output of rnorm between R-4.0.2 and R-3.6.0, but did not find any differences for runif between the two systems. In my original code, I am using rnorm and was wondering if this may be the reason I am finding slight differences in coefficients for glmnet and lars testing between my local computer (R-4.0.2) and my remote computer (R-3.6.0). I am running my code locally on Mac OS X and remotely on what I believe is an HPC.
>>
>> Thanks.
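
A rough sketch of how the runif/rnorm comparison described above could be made exact rather than judged by eye. The function name draw_once is hypothetical, and the perturbed vector only stands in for results from a second machine.

```r
# Hypothetical helper: draw a few values under an explicitly fixed RNG.
draw_once <- function(seed = 123) {
  set.seed(seed, kind = "Mersenne-Twister", normal.kind = "Inversion")
  list(runif = runif(5), rnorm = rnorm(5))
}

draws <- draw_once()

# Hexadecimal floating-point output shows every bit of each value, so
# printouts from two machines can be compared without rounding.
print(lapply(draws, function(x) sprintf("%a", x)))

# Stand-in for values from another machine: a tiny perturbation.
perturbed <- draws$rnorm + 1e-15
bitwise_same <- identical(draws$rnorm, perturbed)          # bit-for-bit
roughly_same <- isTRUE(all.equal(draws$rnorm, perturbed))  # within tolerance
print(c(bitwise = bitwise_same, tolerance = roughly_same))
```

If the values agree under all.equal() but not under identical(), the discrepancy is at the level of floating-point arithmetic rather than a different random stream.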
> 


