[Rd] Bug in sample()
kellieotto at berkeley.edu
Tue Mar 7 20:06:48 CET 2017
Philip Stark and I think we have found a problem with how R generates
random samples, resulting from how it generates random integers between 1
and n. (If we are reading the code correctly, the method is to multiply a
pseudo-random binary fraction by n, take the floor, and add 1; this suffers
from quantization effects that can get quite large when n is just below
A better method, used in Python, is to generate ceil(log_2(n))
pseudo-random bits, add 1, and discard values bigger than n.
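
To make the quantization effect concrete, here is a minimal Python sketch of the two approaches as described above. It is a toy model, not R's or Python's actual source: the function names and the parameter w (the number of bits in the pseudo-random binary fraction) are hypothetical, chosen only for illustration.

```python
import random
from collections import Counter


def multiply_floor_counts(n, w):
    """Count how often each value in 1..n is produced by
    floor(u * n / 2**w) + 1 when u ranges over all 2**w equally
    likely w-bit fractions (w is an assumed toy parameter)."""
    return Counter(((u * n) >> w) + 1 for u in range(1 << w))


def rejection_counts(n):
    """Count how often each value in 1..n is accepted when all
    ceil(log2(n))-bit patterns are enumerated, 1 is added, and
    values bigger than n are discarded."""
    k = (n - 1).bit_length()  # equals ceil(log2(n)) for n >= 1
    return Counter(r + 1 for r in range(1 << k) if r + 1 <= n)


def sample_by_rejection(n, rng=random):
    """Draw one integer from 1..n by rejection sampling."""
    k = (n - 1).bit_length()
    while True:
        # Guard k == 0 (n == 1): older Pythons reject getrandbits(0).
        r = (rng.getrandbits(k) if k else 0) + 1
        if r <= n:
            return r


# Toy case: n = 10 with a 4-bit fraction (16 equally likely inputs).
mf = multiply_floor_counts(10, 4)
rj = rejection_counts(10)
print(sorted(mf.items()))
print(sorted(rj.items()))
```

In this toy case 16 inputs cannot be split evenly over 10 outputs, so the multiply-and-floor mapping produces some values twice as often as others, while every value that survives rejection corresponds to exactly one bit pattern. The rejection method pays for this with a variable number of draws per sample.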
Attached is a short document explaining the issue in more detail.
Ph.D. Statistics '19, University of California, Berkeley
Fellow at Berkeley Institute for Data Science
Mobile: (650) 520-5056
-------------- next part --------------
A non-text attachment was scrubbed...
Size: 230126 bytes