[Rd] Problem in random number generation for Marsaglia-Multicarry + Kinderman-Ramage

peter dalgaard pd@|gd @end|ng |rom gm@||@com
Thu Aug 12 15:07:52 CEST 2021


With these matters, one has to be careful to distinguish between method error and implementation error. 

The reason for changing the RNG setup in R v. 1.7.0 was pretty much this kind of unfortunate interaction between M-M and K-R. There are even more egregious examples for the distribution of maxima of normal variables. Try e.g.

RNGversion("1.6.0") # Marsaglia-Multicarry, Kinderman-Ramage
 s <- replicate(1e6,max(rnorm(10)))
 plot(density(s))

(A further bug in K-R was fixed in 1.7.1, but that is tangential to this.)

A glimpse of the source of the problem is seen in the "microcorrelations" in this:
 
RNGkind("Mar");m <- matrix(runif(4e7),2)
 plot(m[1,],m[2,],xlim=c(0,1e-3),pch=".")
 m <- matrix(runif(4e7),2)
 points(m[1,],m[2,],pch=".")

These examples are from 2003, so the issue has been known for almost 2 decades. However, to the best of our knowledge, the M-M RNG is a faithful implementation of their method, so we have left the RNG in R's arsenal, in case someone needed it for some specific purpose. 

- pd

> On 12 Aug 2021, at 11:51 , GILLIBERT, Andre <Andre.Gillibert using chu-rouen.fr> wrote:
> 
> Dear R developers,
> 
> 
> In my opinion, I discovered a severe flaw that occur with the combination of the Marsaglia-Multicarry pseudo-random number generator associated to the Kinderman-Ramage algorithm to generate normally distributed numbers.
> 
> 
> The sample program is very simple (tested on R-4.1.1 x86_64 on Windows 10):
> 
> set.seed(1, "Marsaglia-Multicarry", normal.kind="Kinderman-Ramage")
> v=rnorm(1e7)
> poisson.test(sum(v < (-4)))$conf.int # returns c(34.5, 62.5)
> poisson.test(sum(v > (4)))$conf.int # returns c(334.2, 410.7)
> pnorm(-4)*1e7 # returns 316.7
> 
> 
> There should be approximatively 316 values less than -4 and 316 values greater than +4, bug there are far too few values less than -4.
> 
> Results are similar with other random seeds, and things are even more obvious with larger sample sizes.
> 
> The Kinderman-Ramage algorithm is fine when combined to Mersenne-Twister, and Marsaglia-Multicarry is fine when combined with the normal.kind="Inversion" algorithm, but the combination of Marsaglia-Multicarry and Kinderman-Ramage seems to have severe flaws.
> 
> R should at least warn for that combination !
> 
> What do you think? Should I file a bug report?
> 
> --
> Sincerely
> Andr� GILLIBERT
> 
> 	[[alternative HTML version deleted]]
> 
> ______________________________________________
> R-devel using r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd.mes using cbs.dk  Priv: PDalgd using gmail.com



More information about the R-devel mailing list