[Rd] efficiency of sample() with prob.

Vadim Ogranovich vograno at evafunds.com
Tue Jun 21 19:53:22 CEST 2005


In his "Introduction to Probability Models" Sheldon Ross describes (sec
11.4.1, 8th edition) the alias method for such weighted sampling.
It is based on some decomposition of the original distribution (the
weights) into a mixture of two-point distributions. I don't know the
run-time complexity of the decomposition step. But once the
decomposition is found the complexity of generating a sample is linear
in the size of the sample.

I haven't coded this myself because I found an easier way to deal with
the original problem I had at the time.
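
For reference, though, the two steps would look roughly like the sketch
below (untested; build_alias() and sample_alias() are just names I use
here for illustration):

build_alias <- function(w) {
    n <- length(w)
    scaled <- w * n / sum(w)       # rescale so the weights average to 1
    q <- numeric(n)                # prob. of keeping column i itself
    a <- integer(n)                # the alias, i.e. the other point
    small <- which(scaled < 1)
    large <- which(scaled >= 1)
    while (length(small) > 0 && length(large) > 0) {
        s <- small[1]; small <- small[-1]
        l <- large[1]; large <- large[-1]
        q[s] <- scaled[s]
        a[s] <- l
        scaled[l] <- scaled[l] + scaled[s] - 1   # l donates its excess to s
        if (scaled[l] < 1) small <- c(small, l) else large <- c(large, l)
    }
    q[c(small, large)] <- 1        # leftover columns keep themselves
    list(q = q, a = a)
}

sample_alias <- function(tab, size) {
    n <- length(tab$q)
    i <- sample.int(n, size, replace = TRUE)   # uniform column
    u <- runif(size)
    ifelse(u < tab$q[i], i, tab$a[i])          # keep it or take its alias
}

Once the table is built, each draw costs one uniform integer, one
uniform deviate and at most one comparison, whatever the number of
weights.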

HTH,
Vadim
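
P.S. Regarding the bisection search over the cumulative weights
mentioned below: I imagine it boils down to something like the
following (again just a sketch, not necessarily what Bo implemented;
findInterval() does the lookup in compiled code):

sample_bisect <- function(w, size) {
    cw <- cumsum(w)                         # cumulative weights
    u  <- runif(size, 0, cw[length(cw)])    # uniforms on (0, total weight)
    findInterval(u, cw) + 1L                # one search per draw, done in C
}

With 10 million weights that is about log2(1e7) ~ 23 comparisons per
draw, which is why the constant-per-draw alias lookup above should
still come out ahead for large samples.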

> -----Original Message-----
> From: r-devel-bounces at r-project.org 
> [mailto:r-devel-bounces at r-project.org] On Behalf Of Bo Peng
> Sent: Tuesday, June 21, 2005 9:24 AM
> To: r-devel at r-project.org
> Subject: [Rd] efficiency of sample() with prob.
> 
> Dear list,
> 
> A while ago, Vadim asked for opinions on improving the efficiency of
> sample() with prob, e.g. sampling with replacement with weights
> ( https://stat.ethz.ch/pipermail/r-devel/2004-September/030844.html ).
> He did not post what he ended up doing about this problem, though.
> 
> I am having exactly the same problem. I need to sample  with 
> replacement from a population of size 10 million with fitness 
> values for each individual. sample() is too slow for this purpose.
> 
> I implemented a bisection search algorithm. It is about 30%
> faster than the linear search one, but still not good enough
> for me. (I can post the function if needed.) Does anybody have
> some good ideas? The only other idea I have is using a faster
> (but worse) random number generator just for this application.
> 
> Thanks.
> Bo
> 
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>


