[R] Efficient sampling from a discrete distribution in R

Issac Trotts issac.trotts at gmail.com
Tue Sep 4 05:48:00 CEST 2007


Hello r-help,

As far as I've seen, there is no function in R dedicated to sampling
from a discrete distribution with a specified mass function.  The
standard library doesn't come with anything called rdiscrete or rpmf,
and I can't find any such thing on the cheat sheet or in the
Probability Distributions chapter of _An Introduction to R_.  Googling
also didn't bring back anything.  So, here's my first attempt at a
solution.  I'm hoping someone here knows of a more efficient way.

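# Sample from a discrete distribution with given probability mass function.
# Returns `size` indices into pmf, drawn by inverting the cumulative mass function.
rdiscrete = function(size, pmf) {
  stopifnot(length(pmf) > 1, all(pmf >= 0), sum(pmf) > 0)
  cmf = cumsum(pmf)
  # Rescale so the last value is exactly 1; otherwise rounding error can
  # leave it slightly below 1 and which() may come back empty.
  cmf = cmf / cmf[length(cmf)]
  # Inverse cmf: smallest index whose cumulative mass exceeds p
  icmf = function(p) {
    min(which(p < cmf))
  }
  ps = runif(size)  # uniform draws in (0, 1)
  sapply(ps, icmf)
}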
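
One possible way to avoid the per-draw sapply loop is to let findInterval do the inverse lookup for all draws at once. The sketch below is untested for speed; the name rdiscrete_vec is made up for illustration, and it assumes pmf holds non-negative weights with a positive sum.

```r
# Vectorized inverse-cmf sampling (rdiscrete_vec is a hypothetical name).
rdiscrete_vec <- function(size, pmf) {
  stopifnot(length(pmf) > 1, all(pmf >= 0), sum(pmf) > 0)
  cmf <- cumsum(pmf)
  # Rescale so the last cumulative value is exactly 1.
  cmf <- cmf / cmf[length(cmf)]
  # findInterval(p, cmf) counts how many cmf values are <= p, so adding 1
  # gives the smallest index i with p < cmf[i].
  findInterval(runif(size), cmf) + 1L
}

set.seed(1)
xs <- rdiscrete_vec(10000, c(0.2, 0.3, 0.5))
print(table(xs) / length(xs))
```

Entries with zero mass produce ties in cmf, and findInterval skips past them, so those indices are never returned.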

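# Check that a fair two-outcome pmf yields about half ones.
test.rdiscrete = function(N = 10000) {
  # The observed proportion has standard deviation 0.5 / sqrt(N), so this
  # tolerance is 12 standard deviations and should almost never trip.
  err.tol = 6.0 / sqrt(N)
  xs = rdiscrete(N, c(0.5, 0.5))
  err = abs(sum(xs == 1) / N - 0.5)
  stopifnot(err < err.tol)
  list(e = err, xs = xs)
}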
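
The same check could be extended to an arbitrary pmf by comparing every observed frequency with its target. This is only a sketch: check_sampler and ref_sampler are hypothetical names, and ref_sampler leans on the prob argument of base R's sample(), which may or may not be faster than the code above.

```r
# Compare empirical frequencies from any sampler against the target pmf.
# check_sampler and ref_sampler are hypothetical helper names.
check_sampler <- function(sampler, pmf, N = 10000) {
  xs <- sampler(N, pmf)
  # Fraction of draws landing on each index 1..length(pmf)
  freq <- tabulate(xs, nbins = length(pmf)) / N
  err <- max(abs(freq - pmf / sum(pmf)))
  err.tol <- 6.0 / sqrt(N)  # same loose tolerance as test.rdiscrete
  stopifnot(err < err.tol)
  list(e = err, freq = freq)
}

# Reference sampler: sample() draws indices with the given weights.
ref_sampler <- function(size, pmf) {
  sample(seq_along(pmf), size = size, replace = TRUE, prob = pmf)
}

set.seed(1)
res <- check_sampler(ref_sampler, c(0.2, 0.3, 0.5))
print(res$freq)
```

Any function taking (size, pmf) and returning indices, such as rdiscrete, can be passed in place of ref_sampler.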

Thanks,
Issac


