[R] simulate binary markov chain

Charles C. Berry cberry at tajo.ucsd.edu
Wed Dec 17 08:01:45 CET 2008


On Wed, 17 Dec 2008, Chris Oldmeadow wrote:

> Hi all, I was hoping somebody may know of a function for simulating a large 
> binary sequence (length >10 million) using a (1st order) markov model with 
> known (2x2) transition matrix. It needs to be reasonably fast.

Chris,

The trick is to recognize that the length of each run is a sample from
the geometric distribution (with 1 added to it). rgeom() is vectorized,
so using it provides fast results.

Suppose that your transition matrix is

    |   | 0     | 1     |
    |---+-------+-------|
    | 0 | pi.11 | pi.12 |
    | 1 | pi.21 | pi.22 |
    |---+-------+-------|

where pi.11+p.12 == 1 and pi.21+pi.22 == 1

This function

  foo <- function(n,pi.12,pi.21) inverse.rle( list(values=rep(0:1,n) ,
 		lengths=1+rgeom( 2*n, rep( c( pi.12, pi.21 ), n) )))

will generate a sequence of 0/1's according to that matrix with length approximately n/pi.12+n/pi.21

On my macbook I get this timing:

> system.time(res <- foo(1205000,.3,.2))
    user  system elapsed
   1.088   0.204   1.291
> prop.table(table(head(res,-1),tail(res,-1)),1) # check!

             0         1
   0 0.6999024 0.3000976
   1 0.1997453 0.8002547
> length(res) # long enough!
[1] 10048040
>


So, if this is fast enough, you just choose 'n' to be a bit larger
than desired length divided by (1/pi.12+1/pi.21) and then discard the 
excess.

Chuck


I have tried 
> the following;
>
> mc<-function(sq,P){
>  s<-c()
>  x<-row.names(P)
>  n<-length(sq)
>  p1<-sum(sq)/n
>  s[1] <- rbinom(1,1,p1);
>  for ( i in 2:n){
>     s[i] <- rbinom( 1, 1, P[s[i-1]+1] )
>  }
>  return(s)
> }
>
>
> P<-c(0.63,0.27)
> x<-rbinom(500,1,0.5)
> new<-mc(x,P)
>
> thanks in advance!
> Chris
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>

Charles C. Berry                            (858) 534-2098
                                             Dept of Family/Preventive Medicine
E mailto:cberry at tajo.ucsd.edu	            UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901



More information about the R-help mailing list