[Rd] A bug in the R Mersenne Twister (RNG) code?

Dirk Eddelbuettel edd at debian.org
Wed Aug 31 17:30:07 CEST 2016


On 30 August 2016 at 18:29, Duncan Murdoch wrote:
| I don't see evidence of a bug.  There have been several versions of the 
| MT; we may be using a different version than you are.  Ours is the 
| 1999/10/28 version; the web page you cite uses one from 2002.
| 
| Perhaps the newer version fixes some problems, and then it would be 
| worth considering a change.  But changing the default RNG definitely 
| introduces problems in reproducibility, so it's not obvious that we 
| would do it.

Yep. FWIW the GNU GSL adopted the 2002 version a while ago too. Quoting from
https://www.gnu.org/software/gsl/manual/html_node/Random-number-generator-algorithms.html

Generator: gsl_rng_mt19937

   The MT19937 generator of Makoto Matsumoto and Takuji Nishimura is a
   variant of the twisted generalized feedback shift-register algorithm, and
   is known as the “Mersenne Twister” generator. It has a Mersenne prime
   period of 2^19937 - 1 (about 10^6000) and is equi-distributed in 623
   dimensions. It has passed the DIEHARD statistical tests. It uses 624 words
   of state per generator and is comparable in speed to the other
   generators. The original generator used a default seed of 4357 and
   choosing s equal to zero in gsl_rng_set reproduces this. Later versions
   switched to 5489 as the default seed, you can choose this explicitly via
   gsl_rng_set instead if you require it.

   For more information see,

      Makoto Matsumoto and Takuji Nishimura, “Mersenne Twister: A
      623-dimensionally equidistributed uniform pseudorandom number
      generator”. ACM Transactions on Modeling and Computer Simulation,
      Vol. 8, No. 1 (Jan. 1998), Pages 3–30

   The generator gsl_rng_mt19937 uses the second revision of the seeding
   procedure published by the two authors above in 2002. The original
   seeding procedures could cause spurious artifacts for some seed values.
   They are still available through the alternative generators
   gsl_rng_mt19937_1999 and gsl_rng_mt19937_1998.

Note the last sentence here.
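
To make the seeding difference concrete, here is a minimal Python sketch of
the 1998 and 2002 seeding procedures. It follows the published MT19937
reference code and is not taken from R's or GSL's sources. The function
names (seed_1998, seed_2002, first_output) are made up for this
illustration. The low-bit weakness it demonstrates is only one easily
checked property of the 1998 scheme; that it is the kind of artifact the
GSL manual refers to is an assumption.

```python
N, M = 624, 397          # state size and twist offset of MT19937
MASK32 = 0xFFFFFFFF      # keep arithmetic within 32 bits
MATRIX_A = 0x9908B0DF    # twist matrix constant
UPPER, LOWER = 0x80000000, 0x7FFFFFFF


def seed_1998(s):
    """1998-style seeding: a purely multiplicative LCG (multiplier 69069)."""
    mt = [s & MASK32]
    for i in range(1, N):
        mt.append((69069 * mt[i - 1]) & MASK32)
    return mt


def seed_2002(s):
    """2002-style seeding: mixes in the high bits and the word index."""
    mt = [s & MASK32]
    for i in range(1, N):
        prev = mt[i - 1]
        mt.append((1812433253 * (prev ^ (prev >> 30)) + i) & MASK32)
    return mt


def first_output(mt):
    """First 32-bit output for a freshly seeded state (twist word 0, temper)."""
    y = (mt[0] & UPPER) | (mt[1] & LOWER)
    x = mt[M] ^ (y >> 1) ^ (MATRIX_A if y & 1 else 0)
    # Tempering transform from the reference implementation.
    x ^= x >> 11
    x ^= (x << 7) & 0x9D2C5680
    x ^= (x << 15) & 0xEFC60000
    x ^= x >> 18
    return x & MASK32


if __name__ == "__main__":
    seed = 4096  # 2**12: a seed whose low 12 bits are all zero
    old_state = seed_1998(seed)
    new_state = seed_2002(seed)
    # With the multiplicative scheme, zero low bits in the seed stay zero
    # in every state word; the 2002 scheme breaks this pattern by adding i.
    print(all(w & 0xFFF == 0 for w in old_state))
    print(all(w & 0xFFF == 0 for w in new_state))
    print(first_output(seed_2002(5489)))
```

The sketch stops at the first output on purpose: the generator core is the
same across versions, so any difference in the streams comes from how the
624-word state is filled from the seed.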

This is all somewhat technical code, so when I noticed the above I could
never figure out exactly what R was doing in its implementation.  But
"innocent until proven guilty" -- a sufficient number of people ought to
have looked at this -- so I saw no need to pursue it further.

Dirk

-- 
http://dirk.eddelbuettel.com | @eddelbuettel | edd at debian.org


