Bug in rnorm. (PR#1664)

rolf@math.unb.ca rolf@math.unb.ca
Thu, 13 Jun 2002 21:36:42 +0200 (MET DST)

There appears to be a mild bug, or at least a deficiency, in
rnorm.  The bug becomes apparent when one looks at extremes
of the squares of the values generated by rnorm; rnorm is not
generating quite enough extreme values.

The R version that I am using is 1.4.1; I never got around to installing
1.5.0, and now since 1.5.1 is about to come out ....  However, checking
the 1.5.0 release notes revealed no mention of fixing a bug in rnorm.

Version details:

platform sparc-sun-solaris2.7
arch     sparc               
os       solaris2.7          
system   sparc, solaris2.7   
major    1                   
minor    4.1                 
year     2002                
month    01                  
day      30                  
language R

More detail regarding how the bug revealed itself:

	For n = 100, 200, ..., 1500

	o I generated 1000 sequences of length n via rnorm(n),

	o for each sequence x, I calculated m = the max of x^2

	o I then calculated pval = 1 - pchisq(m,1)^n

	o I then calculated s.hat.n = #{pval: pval < 0.05}/1000

I then plotted s.hat.n versus n.  This ***should*** give a result
close to a horizontal straight line, at height 0.05 --- but it
didn't.  For the larger values of n, the values of s.hat.n were
displaced significantly below 0.05.

After some discussion with colleagues, I replaced the calls to rnorm()
by calls to myrnorm() defined by

	myrnorm <- function(n,mu=0,sigma=1){
		mu + sigma*cos(2*pi*runif(n))*sqrt(-2*log(runif(n)))

which uses the ``(r,theta)'' method of generating random normals.

When I did so, the resulting values were indeed all ``close to'' 0.05,
as they should be.

I also tried the experiment using rchisq(n,1) instead of rnorm(n) (and
then of course taking m = max of x --- rather than max of x^2).  Again
all the resulting values were close to 0.05 as ought to be the case.
(So rchisq() appears to be OK in this regard.)

Enclosed below is a script to demonstrate the bug.


						Rolf Turner

# Script to demonstrate the bug in rnorm.
myrnorm <- function(n,mu=0,sigma=1){
	mu + sigma*cos(2*pi*runif(n))*sqrt(-2*log(runif(n)))

# If RFUN <- rnorm we get ``wrong'' answers; if RFUN <- myrnorm,
# we get ``right'' answers.
RFUN <- rnorm

NSER <- 1000
rslt <- list()

for(K in 1:15) {
	N   <- 100*K
	M   <- matrix(RFUN(NSER*N),N,NSER)
	T2  <- apply(M,2,function(x){max(x**2)})
	PV  <- 1 - pchisq(T2,1)**N
	SZ  <- sum(PV < 0.05)/NSER
	rslt[[K]] <- SZ

rslt <- unlist(rslt)

# Clean up:

r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-devel-request@stat.math.ethz.ch