[R] simulating outcomes - categorical distribution (?)

Marc Schwartz marc_schwartz at comcast.net
Fri Jan 30 17:20:21 CET 2009


on 01/30/2009 09:46 AM Gonçalo Ferraz wrote:
> Hi,
> 
> I am simulating an event that has 15 possible outcomes and I have a
> vector 'pout' that gives me the probability of each outcome -
> different outcomes have different probabilities. Does anyone know a
> simple way of simulating the outcome of my event?
> 
> If my event had only two possible outcomes (0/1) I would pick a
> uniform random number between 0 and 1 and use it to choose between the
> two possible outcomes given a certain probability of outcome 0 or
> failure 1.
> 
> Now, since I have 15 possible outcomes, I guess that this should be
> done with some form of a categorical distribution but I couldn't find
> a function for the categorical distribution yet.
> 
> Thanks for any suggestions!
> 
> Gonçalo

See ?sample

  sample(OutcomeVector, NumberOfOutcomesToGenerate, prob = pout,
         replace = TRUE)

where OutcomeVector and pout contain a 1:1 pairing of each event to its
respective probability.

pout <- 1:15 / 100

> pout
 [1] 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09 0.10 0.11 0.12 0.13
[14] 0.14 0.15

# Set seed for reproducibility
set.seed(1)

# Generate 1000 replicates using the target probability distribution
Outcomes <- sample(1:15, 1000, prob = pout, replace = TRUE)


> table(Outcomes)
Outcomes
  1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
 11  15  27  43  34  48  53  73  70  65 102 112 109 120 118


# Note that the random sample will "approach" the desired distribution
# in large samples, not be exact
> prop.table(table(Outcomes))
Outcomes
    1     2     3     4     5     6     7     8     9    10    11    12
0.011 0.015 0.027 0.043 0.034 0.048 0.053 0.073 0.070 0.065 0.102 0.112
   13    14    15
0.109 0.120 0.118


BTW, you could use sample() to do this for binary outcomes as well, or
use rbinom() to generate replicates from a binomial distribution.

Using:

  help.search("distribution")

will lead you to additional information on other distributions available.

HTH,

Marc Schwartz




More information about the R-help mailing list