[R] Making a markov transition matrix

Sun Jan 22 08:14:34 CET 2006

On Sun, Jan 22, 2006 at 01:47:00PM +1100, Bill.Venables at csiro.au wrote:
> If this is a real problem, here is a slightly tidier version of the
> function I gave on R-help:
> 
> transitionM <- function(name, year, state) {
>   raw <- data.frame(name = name, state = state)[order(name, year), ]
>   raw01 <- subset(data.frame(raw[-nrow(raw), ], raw[-1, ]), 
>                         name == name.1)
>   with(raw01, table(state, state.1))
> }
> 
> Notice that this does assume there are 'no gaps' in the time series
> within firms, but it does not require that each firm have responses for
> the same set of years.
> 
> Estimating the transition probability matrix when there are gaps within
> firms is a more interesting problem, both statistically and, when you
> figure that out, computationally.

With help from Gabor, here's my best effort. It should work even if
there are gaps in the time series within firms, because only pairs of
observations in consecutive years are counted, and it allows different
firms to have responses in different years. It is wrapped up as a
function which eats a data frame. Somebody should put this function
into Hmisc or gtools or something of the sort.
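
To make the gap handling concrete, here is a minimal sketch of the
underlying idea. The data frame `toy` and the column names `from` and
`to` are made up for illustration and are not part of the function
further down; the sketch also assumes the time variable advances in
steps of 1.

```r
# Hypothetical toy data: firm f1 has no record for year 85.
toy <- data.frame(name  = c("f1", "f1", "f1", "f2", "f2"),
                  year  = c(83, 84, 86, 83, 84),
                  state = factor(c("a", "b", "a", "b", "b"),
                                 levels = c("a", "b")))

# Each record, re-keyed by the year its successor would carry.
lagged <- data.frame(name = toy$name, year = toy$year + 1,
                     from = toy$state)

# The same records under their own year, holding the destination state.
current <- data.frame(name = toy$name, year = toy$year,
                      to = toy$state)

# Inner join: only records observed in consecutive years find a match,
# so the 84 -> 86 jump for f1 contributes no transition.
pairs <- merge(lagged, current, by = c("name", "year"))

counts <- table(from = pairs$from, to = pairs$to)
print(counts)
```

Only two transitions survive the join (f1 from 83 to 84, and f2 from
83 to 84), which is the sense in which gaps are tolerated.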

# Problem statement:
#
# You are holding a dataset where firms are observed for a fixed
# (and small) set of years. The data is in "long" format - one
# record for one firm for one point in time. A state variable is
# observed (a factor).
# You wish to make a Markov transition matrix describing the
# time-series evolution of that state variable.

set.seed(1001)

# Raw data in long format --
raw <- data.frame(name=c("f1","f1","f1","f1","f2","f2","f2","f2"),
                  year=c(83,   84,  85,  86,  83,  84,  85,  86),
                  state=factor(sample(1:3, 8, replace=TRUE), levels=1:3)
                  )

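# Note: despite its name, this returns a table of transition counts
# (rows: state at time t, columns: state at time t+1).
transition.probabilities <- function(D, timevar="year",
                                     idvar="name", statevar="state") {
  # Pair each record with the same firm's record from one period earlier;
  # records with no predecessor (first year, or after a gap) drop out.
  merged <- merge(D, cbind(nextt=D[,timevar] + 1, D),
                  by.x = c(timevar, idvar), by.y = c("nextt", idvar))
  # The .x column holds the later state and the .y column the earlier
  # one, so transpose to put the earlier state on the rows.
  t(table(merged[, grep(statevar, names(merged), value = TRUE)]))
}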

transition.probabilities(raw, timevar="year", idvar="name", statevar="state")
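
The call above yields transition counts. One way to turn counts into
estimated transition probabilities is to divide each row by its total.
A minimal sketch follows; the `counts` matrix is typed in by hand
purely for illustration and is not the output of the call above.

```r
# Hypothetical transition counts (rows: state at t, columns: state at t+1).
counts <- matrix(c(2, 1, 1,
                   0, 3, 1,
                   1, 1, 2),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(from = c("1", "2", "3"),
                                 to   = c("1", "2", "3")))

# Divide each row by its row total so that every row sums to 1.
P <- prop.table(counts, margin = 1)
print(round(P, 3))
```

A state that is never observed as a starting point has a row total of
zero, and its row comes out as NaN, so such rows need separate
handling.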

-- 
Ajay Shah                                      http://www.mayin.org/ajayshah  
ajayshah at mayin.org                             http://ajayshahblog.blogspot.com
<*(:-? - wizard who doesn't know the answer.