1

How to generate the transition matrix and predict the next 2 events using a Markov chain model?

My data is in the data.table dt shown below:

library(data.table)

v1 <- c(1,1,1,1,1,2,2,2,3,3,3,3,3,3,3)
v2 <- c("Jan","Jan","Jan","Feb","Feb","Jan","Jan","Feb","Jan","Jan","Feb","Feb","Feb","Feb","Feb")
v3 <- c("A1","E1","F1","B1","A1","E1","B1","C1","B1","D1","E1","A1","B1","C1","F1")
dt <- data.table(emp_id=v1,month=v2,work=v3)

temp1 <- dt[,.(list(work)),.(emp_id,month)]
head(temp1)

temp2 <- temp1[,.(list(V1)),.(emp_id)]
head(temp2)

temp2[,V1 := lapply(V1, unlist, use.names = FALSE)]
dt <- setnames(temp2,"V1","Events")

sid

2 Answers

2

There is an R package called markovchain, described in this document, that provides a function to fit a Markov chain to a given sequence (markovchainFit) and one to compute predictions from the fitted Markov chain (predict).
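As a rough illustration of those two functions, here is a minimal sketch. It assumes the markovchain package is installed, and it uses only the event sequence of emp_id 1 from the question; the variable names events, fit and next_two are made up for this example.

```r
# Assumes the markovchain package is installed:
# install.packages("markovchain")
library(markovchain)

# Event sequence of one employee (emp_id 1 in the question's data)
events <- c("A1", "E1", "F1", "B1", "A1")

# Fit a first-order Markov chain by maximum likelihood
fit <- markovchainFit(data = events, method = "mle")
fit$estimate  # fitted markovchain object; printing it shows the transition matrix

# Predict the next two events, starting from the last observed event
next_two <- predict(fit$estimate, newdata = tail(events, 1), n.ahead = 2)
next_two
```

The same call pattern should carry over to longer sequences; how states without any observed outgoing transition are handled depends on the package version, so it is worth inspecting the fitted matrix before trusting a prediction.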

Edit: In response to another proposed answer that I find inaccurate, I am adding some elements to my initial answer.

Typically, a Markov chain is estimated (learned) via Maximum Likelihood (ML), Maximum A Posteriori (MAP), or other techniques such as the bootstrap. These methods can, for example, handle the case where some existing states of the Markov chain are not reached by the available sequence(s) without giving them a probability of 0 in the transition matrix. These are classical approaches for any Markovian model (Markov chain, hidden Markov model, ...).
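One simple way to keep unseen transitions from getting a probability of 0 is additive (Laplace) smoothing, which can be read as a Bayesian estimate under a symmetric Dirichlet prior. The base R sketch below is only an illustration of that idea, not the method of any particular package; the helper smoothed_transitions and its pseudo argument are hypothetical names introduced here.

```r
# Hypothetical helper: transition matrix with additive smoothing.
# `pseudo` is an illustrative pseudo-count added to every cell.
smoothed_transitions <- function(sequences, states, pseudo = 1) {
  counts <- matrix(0, nrow = length(states), ncol = length(states),
                   dimnames = list(states, states))
  for (s in sequences) {
    s <- as.character(s)              # index by state name, not position
    for (i in seq_along(s)[-1]) {     # empty loop for sequences of length 1
      counts[s[i - 1], s[i]] <- counts[s[i - 1], s[i]] + 1
    }
  }
  counts <- counts + pseudo           # every transition gets some mass
  counts / rowSums(counts)            # normalise each row to sum to 1
}

# One event sequence per employee, as in the question's data
sequences <- list(
  c("A1", "E1", "F1", "B1", "A1"),
  c("E1", "B1", "C1"),
  c("B1", "D1", "E1", "A1", "B1", "C1", "F1")
)
states <- sort(unique(unlist(sequences)))
P <- smoothed_transitions(sequences, states)
round(P, 2)
```

With pseudo = 0 the helper reduces to plain row-normalised counts; larger values pull every row towards a uniform distribution.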

Eskapp
0

For a transition matrix, all you have to do is create a square matrix over your states, go through all observed sequences, count the transitions, and divide each row by its row sum. For a single sequence stored in a character vector named sequence, e.g.

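  # Square matrix of states, initialised with zero counts
  sequence <- as.character(sequence)  # index by state name, not by position
  states <- unique(sequence)
  mat <- matrix(data = 0,
                nrow = length(states),
                ncol = length(states),
                dimnames = list(states, states))

  # Count the observed transitions (no iterations if length(sequence) < 2)
  for (i in seq_along(sequence)[-1]) {
    mat[sequence[i - 1], sequence[i]] <- mat[sequence[i - 1], sequence[i]] + 1
  }

  # Divide each row by its sum to get transition probabilities;
  # a state with no observed outgoing transition gives a row of NaN
  mat <- mat / rowSums(mat)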

The most likely next state is the one with the highest value in the transition matrix row of your last state.
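To get the next 2 events asked for in the question, the same lookup can be applied twice. The sketch below assumes the three per-employee sequences from the question and pools their transition counts into one matrix; predict_next is a hypothetical helper name, and ties between equally likely successors are broken by column order.

```r
# One event sequence per employee, as in the question's data
sequences <- list(
  c("A1", "E1", "F1", "B1", "A1"),
  c("E1", "B1", "C1"),
  c("B1", "D1", "E1", "A1", "B1", "C1", "F1")
)
states <- sort(unique(unlist(sequences)))

# Pool the transition counts of all sequences, then normalise the rows
mat <- matrix(0, nrow = length(states), ncol = length(states),
              dimnames = list(states, states))
for (s in sequences) {
  for (i in seq_along(s)[-1]) {
    mat[s[i - 1], s[i]] <- mat[s[i - 1], s[i]] + 1
  }
}
mat <- mat / rowSums(mat)

# Hypothetical helper: greedily follow the most probable successor
predict_next <- function(mat, last_state, n_ahead = 2) {
  out <- character(n_ahead)
  current <- last_state
  for (k in seq_len(n_ahead)) {
    current <- colnames(mat)[which.max(mat[current, ])]
    out[k] <- current
  }
  out
}

pred <- predict_next(mat, "F1")    # "F1" is the last event of emp_id 3
pred

# Full distribution over the state two steps ahead of "F1"
two_step <- (mat %*% mat)["F1", ]
two_step
```

The greedy path and the two-step distribution answer slightly different questions: the first gives the most likely event at each step in turn, the second gives the probability of each state two steps ahead regardless of the intermediate event.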

sebastianmm
    I have to disagree with this. The technique you explain can be a good way of initializing the transition matrix for an iterative estimation algorithm, but it is not an accurate way to fit a Markov chain to data. People would usually use a Maximum Likelihood (ML) or a Maximum A Posteriori (MAP) approach. These two methods can, for example, handle the case where one existing state of the Markov chain is not reached by the available sequence(s) without giving it a probability of 0. – Eskapp Jul 25 '17 at 13:34