[R] Issues with convolve
Duncan Murdoch
murdoch at stats.uwo.ca
Wed Jul 20 19:09:52 CEST 2005
On 7/20/2005 12:52 PM, Vincent Goulet wrote:
> We obtained some disturbing results from convolve() (inaccuracies and negative
> probabilities). We'll try to make the context clear in as few lines as
> possible...
>
> Our function panjer() (code below) basically computes recursively the
> probability mass function of a compound Poisson distribution. When the
> Poisson parameter lambda is very large, the starting value of the recursive
> scheme --- the mass at 0 --- is 0 and the recursion fails. One way to solve
> this problem is to divide lambda by 2^n, apply the panjer() function and then
> convolve the result with itself n times.
>
> We applied the panjer() function with a lambda such that the mass at 0 is just
> larger than .Machine$double.xmin. We thus know that once this pmf is
> convoluted with itself, the first probabilities will be 0 (for the computer).
>
> Here are the two issues we have with convolve():
>
> 1. The probabilities we know should be 0 are rather in the vicinity of 1E-19,
> as if convolve() could not "go lower". Using a hand made convolution function
> (not given here), we obtained the correct values. When probabilities get
> around 1E-12, results from convolve() and our home made function are
> essentially identical.
>
> 2. We obtained negative probabilities. More accurately, the same example
> returns negative probabilities under Windows, but not under Linux. We also
> obtained negative probabilities for another example under Linux, though.
>
> We understand that convolve() computes the convolutions using fft(), but we
> are not familiar enough with the latter to assess if the above issues are
> some sort of bugs or normal behavior. In the latter case, is there is any
> workaround?
Rounding will depend on the hardware and the compiler, so this might be
normal behaviour. It's a little disturbing, but you shouldn't expect
calculations to be accurate to more digits than your platform supports.
Convolving using an FFT is essentially rotating the vectors in a high
dimensional space, multiplying terms, and then rotating back. Unless
those rotations are unrealistically accurate very small numbers won't
necessarily show up as zeros.
I'd suggest re-thinking your panjer function to work in log
probabilities instead of probabilities, so that you can handle a larger
dynamic range before you run into underflow problems, and you don't need
to use convolve at all. Convert them back to probabilities at the very
end.
Duncan Murdoch
More information about the R-help
mailing list