[R] Offtopic, HT vs. HH in coin flips

Mon Aug 31 22:55:05 CEST 2009

It gets even more interesting when you ask which of two
three-flip head/tail patterns appears first in an infinite
sequence of heads and tails.  Martin Gardner wrote about
this in the early 1970s:
  Martin Gardner, "Mathematical Games: The Paradox of the Nontransitive
  Dice and the Elusive Principle of Indifference," Scientific American
  223, 110-114, Dec. 1970
(and perhaps again in 1974).  His book, "The Colossal
Book of Mathematics: Classic Puzzles, Paradoxes, and
Problems," has that material reprinted and updated.
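
As a rough illustration of the three-flip version, here is a small
simulation in the same style as the code quoted below.  The two
patterns, "HHT" and "THH", and the seed are my own arbitrary choices
for the sketch; they do not come from the thread.  For this particular
pair the proportion should come out near 3/4, because THH loses only
when the first two flips are both heads.

```r
## Sketch: which of two three-flip patterns shows up first?
## "HHT" and "THH" are illustrative choices, not taken from the thread.
set.seed(1)  # arbitrary seed, only for reproducibility

flips <- replicate(2500,
                   paste(sample(c("H", "T"), 100, replace = TRUE),
                         collapse = ""))

## start position of the first match; -1 means the pattern never occurred
pos_hht <- as.integer(regexpr("HHT", flips, fixed = TRUE))
pos_thh <- as.integer(regexpr("THH", flips, fixed = TRUE))

## keep only the strings in which both patterns occur
both <- pos_hht > 0 & pos_thh > 0

## proportion of strings in which THH starts before HHT
p_thh_first <- mean(pos_thh[both] < pos_hht[both])
p_thh_first
```

Both patterns have the same length, so comparing start positions
is the same as comparing the positions at which they complete.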

Bill Dunlap
TIBCO Software Inc - Spotfire Division
wdunlap tibco.com  

> -----Original Message-----
> From: r-help-bounces at r-project.org 
> [mailto:r-help-bounces at r-project.org] On Behalf Of Erik Iverson
> Sent: Monday, August 31, 2009 1:35 PM
> To: Erik Iverson; r-help at r-project.org
> Subject: Re: [R] Offtopic, HT vs. HH in coin flips
> 
> Part of my issue was that I was not answering my original 
> question.  "What is more likely to show up first, HT or HH?" 
> The answer to that turns out to be "neither", or "identical chances". 
> 
> ht <- replicate(2500,
>                 paste(sample(c("H", "T"), 100, replace = TRUE),
>                       collapse = ""))
> 
> hts <- regexpr("HT", ht) + 1
> hhs <- regexpr("HH", ht) + 1
> 
> ## which is first?
> table(hts < hhs)  # about 50/50 
> 
> summary(hts)      #mean of 4
> summary(hhs)      #mean of 6
> 
> So, "What is more likely to show up first, HH or HT?" is of 
> course a different question than "Are the expected positions 
> of the first HT and the first HH the same?"  I suppose 
> that's where the confusion set in.  It seems that if HH appears 
> later in the string on average (i.e., after 6 tosses instead 
> of 4), then the probability of it being first should be lower 
> than for HT, but that intuition is wrong!
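
The two numbers can also be checked exactly rather than by
simulation.  The sketch below is one way to do it, assuming a fair
coin: write down the expected number of remaining flips from each of
two states ("nothing useful yet" and "last flip was H") and solve the
resulting linear equations.  The names A_ht, A_hh, e_ht and e_hh are
just labels chosen for this example.  The race itself is a tie because
the flip right after the first H decides it: T completes HT, H
completes HH.  HH takes longer on average because a T after an H
sends you back to the start, whereas for HT an H after an H costs
nothing.

```r
## Sketch: exact expected number of flips until a two-flip pattern completes.
## Unknowns: E_S (nothing useful seen yet) and E_H (last flip was H).
## Each row is "E = 1 + average of E after the next flip", rearranged.

## HT:  E_S = 1 + 0.5*E_H + 0.5*E_S ;  E_H = 1 + 0.5*0 + 0.5*E_H
A_ht <- matrix(c(0.5, -0.5,
                 0.0,  0.5), nrow = 2, byrow = TRUE)

## HH:  E_S = 1 + 0.5*E_H + 0.5*E_S ;  E_H = 1 + 0.5*0 + 0.5*E_S
A_hh <- matrix(c( 0.5, -0.5,
                 -0.5,  1.0), nrow = 2, byrow = TRUE)

b <- c(1, 1)

e_ht <- solve(A_ht, b)   # c(E_S, E_H) for the pattern HT
e_hh <- solve(A_hh, b)   # c(E_S, E_H) for the pattern HH

e_ht[1]  # expected flips until the first HT, starting from scratch
e_hh[1]  # expected flips until the first HH, starting from scratch
```

The first elements agree with the simulated means of 4 and 6.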
> 
> A quick graphic that helps show this (you must run the above 
> code first):
> 
> library(lattice)
> 
> ## one label per position: all HT rows first, then all HH rows
> ht.df <- data.frame(count = c(hts, hhs),
>                     type = gl(2, length(hts), labels = c("HT", "HH")))
> 
> barchart(prop.table(xtabs(~ count + type, data = ht.df)),
>          stack = FALSE, horizontal = FALSE,
>          box.ratio = .8, auto.key = TRUE)
> 
> Thanks to all those who replied.  Someone also sent me the 
> following link off list, which also clears up the confusion:
> 
> http://www.mit.edu/~emin/writings/coinGame.html
> 
> Best, 
> Erik 
> 
> -----Original Message-----
> From: r-help-bounces at r-project.org 
> [mailto:r-help-bounces at r-project.org] On Behalf Of Erik Iverson
> Sent: Monday, August 31, 2009 2:17 PM
> To: r-help at r-project.org
> Subject: [R] Offtopic, HT vs. HH in coin flips
> 
> Dear R-help, 
> 
> Could someone please try to explain this paradox to me? What 
> is more likely to show up first in a string of coin tosses, 
> "Heads then Tails", or "Heads then Heads"?  
> 
> ##generate 2500 strings of random coin flips
> ht <- replicate(2500,
>                 paste(sample(c("H", "T"), 100, replace = TRUE),
>                       collapse = ""))
> 
> ## find first occurrence of HT
> mean(regexpr("HT", ht))+1    #mean of HT position, 4
> 
> ## find first occurrence of HH
> mean(regexpr("HH", ht))+1    #mean of HH position, 6
> 
> FYI, this is not homework; I have not been in school in 
> years.  I saw a similar problem posed in a blog post on the 
> Revolutions R blog, and although I believe the answer, I'm 
> having a hard time figuring out why this should be. 
> 
> Thanks,
> Erik Iverson
> 
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.