[Rd] Redundant source code for random number generation

GILLIBERT, Andre  Andre.Gillibert at chu-rouen.fr
Sat Aug 7 12:40:24 CEST 2021


Dear R developers,


While trying to fix the poor performance of the runif() function (I can easily make it three times faster, and perhaps six times faster with some refactoring of the R source code, if you are interested in performance patches), I noticed some redundant code in the R sources (R-devel of 2021-08-05).

Indeed, the family of random number generation functions (runif, rnorm, rchisq, rbeta, rbinom, etc.) is implemented via .Internal functions declared in src/main/names.c and implemented as do_random1, do_random2 and do_random3 in src/main/random.c.


They are also reimplemented in src/library/stats/src/random.c, in three main functions (random1, random2, random3) that end up in a dynamic library (stats.so or stats.dll).


For instance, the stats::runif R function is implemented as:

function (n, min = 0, max = 1)
.Call(C_runif, n, min, max)


but could equivalently be implemented as:

function (n, min = 0, max = 1)
.Internal(runif(n, min, max))


The former calls the src/library/stats/src/random.c implementation (in stats.so or stats.dll) while the latter would call the src/main/random.c implementation (in the main R binary).


The two implementations (src/main/random.c and src/library/stats/src/random.c) are similar but differ in small details. For instance, rbinom always returns a vector of doubles (REAL) in src/main/random.c, while in src/library/stats/src/random.c it tries to return a vector of integers and falls back to doubles only when the values are too large to fit in an int.
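To make the rbinom difference concrete, here is a minimal standalone C sketch of the "integers if they fit, doubles otherwise" decision. It is not the code from either random.c: the names result_kind and choose_result_kind are hypothetical, plain C arrays stand in for R vectors, and treating NaN as storable in an integer vector (as NA) is my assumption.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical tag for the storage type of the result vector.
   This stands in for R's INTSXP / REALSXP; it is not R's API. */
typedef enum { RESULT_INT, RESULT_DOUBLE } result_kind;

/* Decide how a vector of draws could be stored.
   Returns RESULT_INT when every draw fits in a C int, RESULT_DOUBLE
   otherwise. INT_MIN itself is excluded because R reserves it for
   NA_integer_. NaN fails both comparisons, so it counts as fitting
   (assumption: it would be stored as NA in an integer vector). */
result_kind choose_result_kind(const double *draws, size_t n)
{
    assert(draws != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (draws[i] > INT_MAX || draws[i] <= INT_MIN)
            return RESULT_DOUBLE;  /* one oversized value forces doubles */
    }
    return RESULT_INT;
}
```

With this sketch, draws such as {0, 3, 42} would be stored as integers, while a vector containing 3e9 would be stored as doubles. An implementation that always returns doubles, as described for src/main/random.c, skips this check entirely.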


I see no obvious reason to maintain both implementations. In fact, the src/main/random.c version seems to be unused in normal R programs. Some unusual programs may use the .Internal call directly, but I do not think there are many.


There are several strategies for merging the two, but I would like feedback from people who know the R source code well before proposing patches.


--

Sincerely

André GILLIBERT
