[Rd] efficiency of sample() with prob.
Prof Brian Ripley
ripley at stats.ox.ac.uk
Wed Jun 22 22:35:03 CEST 2005
You might recall the message
R is a collaborative project with many contributors.
We suggest that you take up your own suggestion, research this area and
offer the R project the results of your research for consideration as your
It seems likely that in sample(x, size, replace = TRUE, prob) the optimal
method depends on both the size of 'x' and 'size' and perhaps to a lesser
extent on 'prob'. (That's what my book on the subject shows.)
On Wed, 22 Jun 2005, Bo Peng wrote:
> On 6/21/05, Vadim Ogranovich <vograno at evafunds.com> wrote:
>> In his "Introduction to Probability Models" Sheldon Ross describes (sec
>> 11.4.1, 8th edition) the alias method for such weighted sampling.
>> It is based on some decomposition of the original distribution (the
>> weights) into a mixture of two-point distributions.
> This sounds like Walker's alias method for weighted sampling. I looked
> through Knoth's 'the art of computer programming' and find this
> algorithm. I implemented this one but it is just as efficient as the
> bisection lookup method in my case. The reason is that the setup of
> this algorithm is complicated so it is suited for getting large sample
> from short weighted sequences. Anyway, I do suggest R developers try
> this algorithm for sample with replacement. A sample code can be found
> at http://statistik.wu-wien.ac.at/arvag/monograph/arvag-src/algo03_03.c
> BTW, does anyone know a quicker algorithm to set up the internal table
> of the alias method?
Quicker than what? See the discussion in my Stochastic Simulation book
for `quicker than Walker'.
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
More information about the R-devel