[R] Random numbers negatively correlated?

Tue Jun 27 19:42:02 CEST 2006

Dear list,

I ran a simulation in which I generated 10000 independent
Bernoulli(0.5) sequences of length 100. For each sequence I estimated
p, and I also estimated the conditional probability that a one is
followed by another one (which should equal p as well if the draws are
independent). However, the mean of the second estimate is significantly
smaller than 0.5 (about 0.494, see below) and of course smaller than
the mean of the direct estimates of p as well, which seems to indicate
negative correlation between the random numbers.

The code and the results are below.
Did I do something wrong, or are the numbers in fact negatively
correlated? (A type I error is quite unlikely with a p-value below
2.2e-16.)

Best,
Christian

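set.seed(123456)
n <- 100          # length of each Bernoulli sequence
p <- 0.5          # true success probability
simruns <- 10000  # number of simulated sequences
# Preallocate the result vectors instead of growing them in the loop
est <- est11 <- numeric(simruns)
for (i in seq_len(simruns)){
#    if (i %% 100 == 0) print(i)   # optional progress output
     x <- rbinom(n,1,p)
     # Direct estimate of p: relative frequency of ones
     est[i] <- mean(x)
     # Encode each pair (previous, current) as 3*current - previous:
     # -1: 0 follows 1, 0: 0 follows 0, 2: 1 follows 1, 3: 1 follows 0.
     x11 <- 3*x[2:n]-x[1:(n-1)]
     # Relative frequency of "1 follows 1" among all pairs that start with 1
     est11[i] <- sum(x11==2)/sum(x11==2 | x11==(-1))
}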
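
To make the pair encoding 3*x[2:n]-x[1:(n-1)] easier to check, here is
a small sketch on a toy sequence. The vector x_toy and the names
code_toy, pair_counts and est11_toy are made up for illustration only
and are not part of the simulation above.

```r
# Toy sequence, chosen by hand (an assumption for illustration only)
x_toy <- c(1, 1, 0, 1, 1, 1, 0, 0)
n_toy <- length(x_toy)

# Same encoding as in the loop: 3 * current value - previous value
code_toy <- 3 * x_toy[2:n_toy] - x_toy[1:(n_toy - 1)]

# Count each kind of pair:
# -1: 0 follows 1, 0: 0 follows 0, 2: 1 follows 1, 3: 1 follows 0
pair_counts <- table(factor(code_toy, levels = c(-1, 0, 2, 3)))
print(pair_counts)

# Relative frequency of "1 follows 1" among all pairs that start with 1
est11_toy <- sum(code_toy == 2) / sum(code_toy == 2 | code_toy == -1)
print(est11_toy)
```

In this toy sequence, five pairs start with a 1 and three of them are
followed by another 1, so est11_toy is 3/5.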

> print(mean(est))
[1] 0.499554
> print(sd(est)/sqrt(simruns))
[1] 0.0004958232
# OK

> print(mean(est11))
[1] 0.4935211
> print(sd(est11)/sqrt(simruns))
[1] 0.0007136213
# mean(est11) + 2*(standard error of the mean) = 0.4949 < 0.495

> print(sum(est>est11))
[1] 5575
> binom.test(5575,10000)

 	Exact binomial test

data:  5575 and 10000
number of successes = 5575, number of trials = 10000, p-value < 2.2e-16

*** --- ***
Christian Hennig
University College London, Department of Statistical Science
Gower St., London WC1E 6BT, phone +44 207 679 1698
chrish at stats.ucl.ac.uk, www.homepages.ucl.ac.uk/~ucakche