[Rd] Bias in R's random integers?
murdoch.duncan at gmail.com
Wed Sep 19 14:43:26 CEST 2018
On 18/09/2018 5:46 PM, Carl Boettiger wrote:
> Dear list,
> It looks to me that R samples random integers using an intuitive but biased
> algorithm by going from a random number on [0,1) from the PRNG to a random
> integer, e.g.
> Many other languages use various rejection sampling approaches which
> provide an unbiased method for sampling, such as in Go, python, and others
> described here: https://arxiv.org/abs/1805.10941 (I believe the biased
> algorithm currently used in R is also described there). I'm not an expert
> in this area, but does it make sense for R to adopt one of the unbiased
> random sample algorithms outlined there and used in other languages? Would
> a patch providing such an algorithm be welcome? What concerns would need to
> be addressed first?
> I believe this issue was also raised by Killie & Philip in
> http://r.789695.n4.nabble.com/Bug-in-sample-td4729483.html, and more
> recently in
> pointing to the python implementation for comparison:
I think the analyses are correct, but I doubt if a change to the default
is likely to be accepted, as it would make it more difficult to reproduce
older results.
On the other hand, a contribution of a new function like sample() but
not suffering from the bias would be good. The normal way to make such
a contribution is in a user contributed package.
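As a rough illustration of what an unbiased alternative could look like, here is a minimal rejection-sampling sketch. It is written in Python rather than R, it is not taken from R's sources or from any existing package, and the names `unbiased_below` and `rand_bits` are hypothetical. It assumes the underlying generator can supply k uniformly distributed random bits on request.

```python
import random

def unbiased_below(n, rand_bits=random.getrandbits):
    """Return an integer uniform on 0..n-1 using rejection sampling.

    rand_bits(k) is assumed to return an integer uniform on 0..2**k - 1.
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    # Smallest number of bits able to represent n - 1 (at least one bit).
    k = max((n - 1).bit_length(), 1)
    while True:
        r = rand_bits(k)
        # Draws that fall outside 0..n-1 are discarded, so every accepted
        # value has exactly the same probability.
        if r < n:
            return r

# A 1-based draw, in the spirit of sample(n, 1):
draw = unbiased_below(10) + 1
```

Because k is the smallest adequate bit width, each draw is accepted with probability greater than 1/2, so the expected number of retries is less than one. The cost is that the number of PRNG calls per result is no longer fixed, which changes the random number stream relative to the existing method.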
By the way, R code illustrating the bias is probably not very hard to
put together. I believe the bias manifests itself in sample() producing
values with two different probabilities (instead of all equal
probabilities). Those may differ by as much as one part in 2^32. It's
very difficult to detect a probability difference that small, but if you
define the partition of values into the high probability values vs the
low probability values, you can probably detect the difference in a
simulation.
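The two-probability pattern described above can be counted exactly in a scaled-down model. The sketch below is Python, not R, and rests on an assumption: the biased method is modelled as floor(n * U) with U = m / 2^bits, where m is a uniform integer from a generator with `bits` bits of output. The function names are hypothetical, and 8 bits are used instead of 32 so that every generator output can be enumerated.

```python
from collections import Counter
from fractions import Fraction

def bucket_counts(n, bits):
    """Count how many of the 2**bits equally likely outputs m map to each
    value of floor(n * m / 2**bits)."""
    return Counter((n * m) >> bits for m in range(2**bits))

def high_set_excess(n, bits):
    """Return the high-probability values and how much their total
    probability exceeds what a uniform distribution would give them."""
    counts = bucket_counts(n, bits)
    top = max(counts.values())
    high = sorted(v for v, c in counts.items() if c == top)
    actual = Fraction(len(high) * top, 2**bits)
    ideal = Fraction(len(high), n)
    return high, actual - ideal

counts = bucket_counts(6, 8)
print(sorted(counts.items()))  # every count is either 42 or 43
print(high_set_excess(6, 8))   # values 0, 1, 3, 4 form the high set
```

With n = 6 and 8 bits, individual probabilities differ by only 1/256, but the high-probability set as a whole is over-represented by 1/192. Testing membership in that set, rather than individual values, is what makes the bias visible.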