[R] Generate random numbers up to one

(Ted Harding) ted.harding at nessie.mcc.ac.uk
Tue Mar 6 23:01:39 CET 2007


On 06-Mar-07 Alberto Monteiro wrote:
> Ted Harding wrote:
>> 
>> And, specifically (to take just 2 RVs X and Y), while U = X/(X+Y)
>> and V = Y/(X+Y) are two RVs which sum to 1, the distribution of U
>> is not the same as the distribution of X conditional on (X+Y = 1).
>> 
> This question

Which question? There are (implicitly) two questions there!

> appeared in October 2006, and the answer

To the second question (X conditional on X+Y=1)

> was the Dirichlet distribution with parameters (1,1,1...1):
> 
> http://en.wikipedia.org/wiki/Dirichlet_distribution
> 
> It's the distribution of uniform U1, U2, ... Un with the
> restriction that U1 + U2 + ... + Un = 1.

Indeed, and the resulting (U1,U2,...,Un) is uniformly distributed
on the simplex U1+U2+...+Un=1. For n>2, however, the resulting
marginal distribution of (say) U1 conditional on (U1+U2+...+Un=1)
is no longer uniform (that only holds for n=2, as in my example).
For n=3 this is easy to see: P[U1 > u1] is the area (as a fraction of
the whole) of the part of the triangular simplex between its vertex at
(1,0,0) and the line from (u1,1-u1,0) to (u1,0,1-u1), and this is equal
to (1 - u1)^2, so the density of U1 is f(u1) = 2*(1-u1).  In general,
the marginal density of U1 in the n-dimensional Dirichlet is
(n-1)*(1-u1)^(n-2).
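
A quick simulation can be used to check this marginal. The sketch below
assumes the standard construction of the Dirichlet(1,...,1) distribution
by normalising independent Exp(1) variables (that construction, and the
names n, N, grid, are illustrative choices, not something from the thread).
It compares the empirical P[U1 > u] with (1-u)^(n-1), which is what
integrating the density (n-1)*(1-u1)^(n-2) from u to 1 gives.

```r
set.seed(1)
n <- 3      # number of components (assumed value for illustration)
N <- 1e5    # number of simulated points

# Normalised independent Exp(1) variables: each row is one point
# on the simplex U1 + ... + Un = 1.
e <- matrix(rexp(N * n), nrow = N, ncol = n)
u <- e / rowSums(e)
u1 <- u[, 1]

# Compare the empirical tail probability with (1 - u)^(n - 1).
grid <- c(0.1, 0.25, 0.5, 0.75)
empirical <- sapply(grid, function(g) mean(u1 > g))
theoretical <- (1 - grid)^(n - 1)
print(round(cbind(grid, empirical, theoretical), 3))
```

Changing n shows how quickly the mass of U1 concentrates near 0 as the
number of components grows.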

But the aim was to illustrate Petr Klasterecky's point that

  "sum(x) is a random variable as well and dividing by
   sum(x) does not preserve the original distribution
   data were generated from."

namely to show two ways of generating RVs distributed on
U1 + U2 + ... + Un = 1, starting from independent RVs, which
result in two different distributions, and to give an example
where dividing by sum(x) can be seen to "not preserve" the
distribution.
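
The contrast can be sketched numerically for two independent U(0,1)
variables X and Y. In the code below the conditioning on X+Y = 1 is only
approximated, by keeping the pairs with |X+Y-1| < eps (eps is an arbitrary
tolerance chosen for illustration). For the ratio, P[X/(X+Y) < 1/4] should
come out near 1/6, while for X given X+Y = 1, which is uniform, it should
come out near 1/4.

```r
set.seed(2)
N <- 1e6
x <- runif(N)
y <- runif(N)

# Way 1: divide by the sum.
u_ratio <- x / (x + y)

# Way 2: condition on the sum being (approximately) 1.
eps <- 0.01                      # assumed tolerance, not from the thread
keep <- abs(x + y - 1) < eps
x_cond <- x[keep]

# Same event, two different probabilities.
p_ratio <- mean(u_ratio < 0.25)
p_cond  <- mean(x_cond < 0.25)
print(c(ratio = p_ratio, conditional = p_cond))
```

Histograms of u_ratio and x_cond (e.g. with hist()) make the same point
visually: the first is peaked at 1/2, the second is flat.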

Indeed, I think there is sometimes a confusion between this
question and the really unrelated question: Given non-negative
numbers V1, V2, ..., Vn, how can we convert them to a probability
distribution? To which the answer is, of course, divide by their
sum.
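
In R that normalisation is a one-liner. The vector v below is made-up
data, purely for illustration, and the sketch assumes that at least one
entry is positive so that the sum is not zero.

```r
v <- c(2, 0, 5, 3)     # hypothetical non-negative weights
stopifnot(all(v >= 0), sum(v) > 0)

p <- v / sum(v)        # probabilities proportional to v
print(p)
```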

With best wishes,
Ted.

--------------------------------------------------------------------
E-Mail: (Ted Harding) <ted.harding at nessie.mcc.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 06-Mar-07                                       Time: 22:01:34
------------------------------ XFMail ------------------------------


