I've been trying to get into hidden Markov models and the Viterbi algorithm recently. I found a library called hmmlearn (http://hmmlearn.readthedocs.io/en/latest/tutorial.html) to help me generate a state sequence for two states (with Gaussian emissions). Then I wanted to re-determine the state sequence using Viterbi. My code works, but predicts approximately 5% of the states wrong (depending on the means and variances of the Gaussian emissions). The hmmlearn library has a .predict method which also uses Viterbi to determine the state sequence.
My problem now is that the Viterbi algorithm by hmmlearn is much better than my hand-written one (error rate is lower than 0.5% compared to my 5%). I couldn't find any major problem in my code, so I'm not sure why this is the case. Below is my code where I first generate the state and observation sequence Z and X, predict Z with hmmlearn and finally predict it with my own code:
# Import libraries
import numpy as np
import scipy.stats as st
from hmmlearn import hmm
# Generate a sequence
model = hmm.GaussianHMM(n_components = 2, covariance_type = "spherical")
model.startprob_ = pi
model.transmat_ = A
model.means_ = obs_means
model.covars_ = obs_covars
X, Z = model.sample(T)
## Predict the states from generated observations with the hmmlearn library
Z_pred = model.predict(X)
# Predict the state sequence with Viterbi by hand
B = np.concatenate((st.norm(mean_1,var_1).pdf(X), st.norm(mean_2,var_2).pdf(X)), axis = 1)
delta = np.zeros(shape = (T, 2))
psi = np.zeros(shape= (T, 2))
### Calculate starting values
for s in np.arange(2):
delta[0, s] = np.log(pi[s]) + np.log(B[0, s])
psi = np.zeros((T, 2))
### Take everything in log space since values get very low as t -> T
for t in range(1,T):
for s_post in range(0, 2):
delta[t, s_post] = np.max([delta[t - 1, :] + np.log(A[:, s_post])], axis = 1) + np.log(B[t, s_post])
psi[t, s_post] = np.argmax([delta[t - 1, :] + np.log(A[:, s_post])], axis = 1)
### Backtrack
states = np.zeros(T, dtype=np.int32)
states[T-1] = np.argmax(delta[T-1])
for t in range(T-2, -1, -1):
states[t] = psi[t+1, states[t+1]]
I'm not sure if I have a big error in my code or if hmmlearn just uses a more refined Viterbi algorithm. I have noticed by looking into the falsely predicted states that the impact of the emission probability B seems to be too big as it causes the states to change too frequently even if the transition probability to go to the other state is really low.
I'm rather new to python so please excuse my ugly coding. Thanks in advance for any tips you might have!
Edit: As you can see in the code, I'm stupid and used variances instead of the standard deviation to determine the emission probabilities. After fixing this, I get the same result as the implemented Viterbi algorithm.