I checked out the link; are you using the hmmlearn library, or did you design your own HMM?
Assuming that you correctly implemented an HMM, whether through your own design or with an external library such as hmmlearn, you should have received a sequence of predicted states after calling your predict method (model.predict(observation_sequence) if you used hmmlearn). To compute an accuracy metric, you need a sequence of true states that corresponds to the predicted state sequence.
pred = [0, 1, 0, 0, 1, 1, 2, 0]
true = [0, 0, 1, 0, 0, 1, 2, 0]
count = 0
# Assuming that pred and true are of the same length
for x in range(len(pred)):
    if pred[x] == true[x]:
        count += 1
accuracy = count / len(pred)
Loop through the indices of the predicted and true states and compare the values of both sequences at each index. If they match, increment a count variable. After the loop finishes, divide the count by the length of pred (or true); this gives you the accuracy.
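If NumPy is available, the same index-by-index comparison can be done in one vectorized line (a sketch assuming, as above, that both sequences have the same length):

```python
import numpy as np

pred = [0, 1, 0, 0, 1, 1, 2, 0]
true = [0, 0, 1, 0, 0, 1, 2, 0]

# Element-wise comparison yields a boolean array; its mean is the
# fraction of matching positions, i.e. the accuracy.
accuracy = np.mean(np.array(pred) == np.array(true))
print(accuracy)  # 0.625
```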
Other metrics such as sensitivity can be computed in a similar way. You can compute the sensitivity for each state as follows:
sensitivity_dict = {state: 0 for state in possible_states}  # Where possible_states is a container or range of all possible states, i.e., [0, 1, 2]
for x in range(len(pred)):
    if pred[x] == true[x]:
        sensitivity_dict[pred[x]] += 1
for state in possible_states:
    sensitivity_dict[state] = sensitivity_dict[state] / true.count(state) if true.count(state) != 0 else -1
Here a dictionary maps each possible state (assuming they are still encoded as integers) to a starting count of 0. Loop through the predicted and true states and check whether the values match. If they do, pred[x] gives the state, which is used to look up that state's count in the sensitivity dictionary and increment it.
Afterwards, loop through each state. Sensitivity is measured by:
TP / (TP + FN)
Where TP are the true positives and FN are the false negatives.
We can get TP + FN simply by counting the number of occurrences of each state in the true sequence of states. Dividing the count held in the sensitivity dictionary by that number of occurrences gives the sensitivity for each state, which may be a better metric to guide your model. If the number of occurrences is 0, however, my code sets sensitivity to -1, meaning that this state never occurred in the true sequence. You can set this default to 0 or whatever you would like it to be.
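The steps above can be wrapped into a small self-contained function; the `default` parameter is my own addition here, standing in for the -1 fallback used when a state never occurs in the true sequence:

```python
def per_state_sensitivity(pred, true, possible_states, default=-1):
    """Compute TP / (TP + FN) for each state, using the same counting
    scheme as above: matches are true positives, and occurrences of a
    state in `true` give TP + FN for that state."""
    hits = {state: 0 for state in possible_states}
    for p, t in zip(pred, true):
        if p == t:
            hits[p] += 1  # true positive for this state
    return {
        state: hits[state] / true.count(state) if true.count(state) else default
        for state in possible_states
    }

pred = [0, 1, 0, 0, 1, 1, 2, 0]
true = [0, 0, 1, 0, 0, 1, 2, 0]
print(per_state_sensitivity(pred, true, [0, 1, 2]))
# {0: 0.6, 1: 0.5, 2: 1.0}
```

With the example sequences, state 0 is matched 3 times out of 5 true occurrences (0.6), state 1 once out of 2 (0.5), and state 2 once out of 1 (1.0).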