[R] Random number quality

Thomas Lumley tlumley at u.washington.edu
Sat Feb 6 23:28:41 CET 2010


On Sat, 6 Feb 2010, Patrick Burns wrote:

> A couple comments.
>
> Although pseudo-random numbers were originally
> used because of necessity rather than choice,
> there is a definite upside to using them.  That
> upside is that the computations become reproducible
> if you set the seed first (see 'set.seed').
>
> I tend to encourage skepticism at pretty much
> every turn.  But I find this piece of skepticism
> a bit misplaced.  The application that you describe
> does not sound at all demanding, and R Core is
> populated by some of the best statistical computing
> people in the world.


It depends on what the random numbers are needed for.  For statistical simulation the default generators are good, and if you want to be even more sure you can run the simulation again with a different generator.
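
A rough sketch of that kind of check, applied to the sampling problem in the original question. The population size, sample size, seed, and the helper name draw_sample are illustrative assumptions, not taken from the question; the generator names are two of the kinds documented in ?RNGkind.

```r
# Hypothetical population: item IDs 1..5000 (size chosen for illustration).
population <- seq_len(5000)

# Draw a simple random sample without replacement under a named generator.
# Note: this changes the global RNG state as a side effect.
draw_sample <- function(pop, kind, seed = 42, n = 300) {
  old <- RNGkind()              # remember the current generator settings
  on.exit(RNGkind(old[1]))      # restore the generator kind afterwards
  RNGkind(kind)                 # switch uniform generator
  set.seed(seed)                # make the draw reproducible
  sample(pop, n)
}

s1 <- draw_sample(population, "Mersenne-Twister")
s2 <- draw_sample(population, "Mersenne-Twister")
s3 <- draw_sample(population, "Wichmann-Hill")

identical(s1, s2)     # same generator and same seed: same sample
mean(s1); mean(s3)    # compare a summary statistic across generators
```

If a conclusion changes materially between generators, that is a reason to look more closely; if it does not, the choice of generator is unlikely to be the weak point.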

There are some purposes for which the generators are inadequate:

1) They are not cryptographically secure: it is feasible to work out the random seed, and hence the future sequence, by observing enough of the output.  They cannot be used to generate numbers that must be unpredictable to an intelligent adversary.  For many such applications you wouldn't want to use numbers from random.org either -- they are sent over public networks, after all.


2) They may not be random enough for some number-theoretic algorithms.  For example, there is an efficient algorithm for finding prime numbers based on random choices, but no efficient deterministic algorithm is known, and it is an open question whether one even exists.  It is possible that simple random number generators could give substantially worse performance in randomized algorithms of this sort, though the limited empirical evidence I am aware of points the other way.
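
To make point 2 concrete, here is a minimal sketch of a prime search driven by random choices: draw random candidates from a range and return the first one that passes a primality test. The function names (is_prime, random_prime), the range, and the seed are hypothetical, and plain trial division stands in for the primality test only because the numbers here are small; it is not what one would use at cryptographic sizes.

```r
# Deterministic trial division; adequate only for small n.
is_prime <- function(n) {
  if (n < 2) return(FALSE)
  if (n < 4) return(TRUE)          # 2 and 3
  if (n %% 2 == 0) return(FALSE)
  d <- 3
  while (d * d <= n) {
    if (n %% d == 0) return(FALSE)
    d <- d + 2
  }
  TRUE
}

# Pick uniformly random candidates in [lo, hi] until one is prime.
random_prime <- function(lo, hi, max_tries = 10000) {
  for (i in seq_len(max_tries)) {
    candidate <- lo + sample.int(hi - lo + 1, 1) - 1
    if (is_prime(candidate)) {
      return(list(prime = candidate, tries = i))
    }
  }
  stop("no prime found within max_tries")
}

set.seed(1)
res <- random_prime(1e6, 2e6)
res$prime   # a prime between 1e6 and 2e6
res$tries   # how many random candidates were examined
```

The number of tries depends on how the candidates are chosen, which is where the quality of the generator could in principle matter.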


               -thomas

> On 05/02/2010 22:04, b k wrote:
>> Hello,
>> 
>> I'm running R 2.10.1 on Windows Vista. I'm selecting a random sample of
>> several hundred items out of a larger population of several thousand. I
>> realize there is srswor() in package sampling for exactly this purpose, but
>> as far as I can tell it uses the native PRNG which may or may not be random
>> enough. Instead I used the random package which pulls random numbers from
>> random.org, although in my extended reading [vignette("random-intro",
>> package="random")] it seems like that may have problems also.
>> 
>> I'm curious what the general consensus is for random number quality for both
>> the native built-in PRNG and any alternatives including the random package.
>> 
>> Thanks,
>> Ben K.
>> 
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide 
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>> 
>
> -- 
> Patrick Burns
> pburns at pburns.seanet.com
> http://www.burns-stat.com
> (home of 'The R Inferno' and 'A Guide for the Unwilling S User')

Thomas Lumley			Assoc. Professor, Biostatistics
tlumley at u.washington.edu	University of Washington, Seattle


