[Rd] uniform sampling without replacement algorithm

Pavel S. Ruzankin ruzankin at math.nsc.ru
Tue Oct 17 19:55:01 CEST 2017


Let us consider the current uniform sampling without replacement 
algorithm. It resides in function do_sample in
https://svn.r-project.org/R/trunk/src/main/random.c
Its complexity is obviously O(n), where the sample is selected from 
1...n, since the algorithm has to create a vector of length n. So when 
the sample size is much lesser than n, the algorithm is not effective. 
Algorithms with average complexity O(s log s), were s is the sample 
size, were described long ago. E.g. see
https://www.degruyter.com/view/j/mcma.1999.5.issue-1/mcma.1999.5.1.39/mcma.1999.5.1.39.xml
Here the Tree algorithm has complexity O(s log s). I suppose that there 
may be algorithms with complexity close to s. Is somebody planning to 
implement some more effective algorithm?



More information about the R-devel mailing list