[R] Quantiles of sums of independent discrete random variables
    Michael Meyer 
    @pyqqqd|@ @end|ng |rom y@hoo@com
       
    Tue Jan 23 16:53:50 CET 2024
    
    
  
Greetings,
I have the following 
Problem:
 Given k (=10) discrete independent random variables X_i with n_i (= 5 to 20) values each,compute quantiles of the distribution of the sum X = X_1+...+X_k.
Here X has n=n_1 x n_2 ... n_k distinct values which is too large to list them all together with
their probabilities.
I tried several approaches: 
(A) Convolution: 
each X_j is approximated with Y_j=X_j+Z, where Z is
an N(0,sigma) variable with small sigma. Then Y_j is a probability mixture of the normal
variables N(x_j,sigma), where the x_j runs over all values of X_j, and has a highly oscillatory density.
The density of Y=\sum Y_j is the convolution of the densities of the Y_j.
I need this density at a sizeable number of points and this turns out to be too slow.
The issue seems to be the convergence of the convolution integrals slowed down by the oscillatory nature
of the densities of the Y_j. When the densities are behaved better (e.g. normal RVs), the computation 
of such a convolution is quite fast. 
(B) Characteristic function: 
X will be approximated with Y=X+Z, where Z is normal N(0,sigma) with small sigma.
Y has a density (which it is impossible to compute directly) but the characteristic function 
(_continuous_ Fourier transform) cf_Y of Y can easily be computed analytically (without knowing
the density of Y)
Now let s be a numeric vector. I want to get the density f_Y(s) of Y evaluated along s.
The proper way of doing this would be to apply the inverse continuous Fourier transform to the function cf_Y at each point in s. 
This is far too slow. That's why I tried to apply the inverse _discrete_ Fourier transform to the vector of values cf_Y(s) 
and that does not yield anything reasonable.
This baffles me since I was under the impression that the discrete Fourier transform is an approximation to the continuous 
Fourier transform and so should yield the values of the latter, if the value grid s is fine enough.
Should this work? 
Note that this inversion would definitely work if I could compute the _discrete_ Fourier transform of the density f_Y
along s, but regrettably this is not possible since 
(a) the density of f_Y is far too complicated, and
(b) the discrete Fourier transform of a sum Y = Y_1+Y_2+...+Y_k of independent 
random variables Y_j is not the product of the discrete Fourier transforms of the Y_j.
Any ideas how I could approach this problem with the tools of R?
Thanks in advance for all replies,
Michael Meyer
	[[alternative HTML version deleted]]
    
    
More information about the R-help
mailing list