Fitting data to hmm.MultinomialHMM

Question

I'm trying to predict the most likely sequence of hidden states for some data using the hmmlearn library, but I get an error. My code is:

from hmmlearn import hmm
trans_mat = np.array([[0.2,0.6,0.2],[0.4,0.0,0.6],[0.1,0.2,0.7]])
emm_mat = np.array([[0.2,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1],[0.1,0.1,0.1,0.1,0.2,0.1,0.1,0.1,0.1],[0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.2]])
start_prob = np.array([0.3,0.4,0.3])
X = [3,4,5,6,7]
model = GaussianHMM(n_components = 3, n_iter = 1000)
X = np.array(X)
model.startprob_ = start_prob
model.transmat_ = trans_mat
model.emissionprob_ = emm_mat

# Predict the optimal sequence of internal hidden state
x = model.fit([X])

print(model.decode([X]))

but I get an error saying:

Traceback (most recent call last):
  File "hmm_loyalty.py", line 55, in <module>
    x = model.fit([X])
  File "build/bdist.macosx-10.6-x86_64/egg/hmmlearn/base.py", line 421, in fit
  File "build/bdist.macosx-10.6-x86_64/egg/hmmlearn/hmm.py", line 183, in _init
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/sklearn/cluster/k_means_.py", line 785, in fit
    X = self._check_fit_data(X)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/sklearn/cluster/k_means_.py", line 758, in _check_fit_data
    X.shape[0], self.n_clusters))
ValueError: n_samples=1 should be >= n_clusters=3

Does anyone know what this means and what I can do to resolve it?

Sergei Lebedev · Accepted Answer · 2016-03-09T15:56:25.430


There are a number of issues with your code:

model is a GaussianHMM. You probably wanted MultinomialHMM.
The input X has the wrong shape. For MultinomialHMM, X must have shape (n_samples, 1), since the observations are 1-D.
You don't need fit unless some of the model parameters have to be estimated, which is not the case here.
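The shape problem in the second point can be seen with NumPy alone. This is only a small sketch; the variable names `as_passed` and `column` are mine, not from the question. Wrapping the 1-D array in a list, as in `model.fit([X])`, gives a single row with five columns, which is consistent with the `n_samples=1` in the traceback.

```python
import numpy as np

obs = [3, 4, 5, 6, 7]

# What the question effectively passed: a list holding one 1-D array.
as_passed = np.asarray([np.array(obs)])
print(as_passed.shape)  # (1, 5): one sample with five features

# What is needed: one observation per row, a single column.
column = np.array(obs).reshape(-1, 1)
print(column.shape)  # (5, 1): five samples with one feature

# np.atleast_2d(obs) has shape (1, 5), so transposing it gives the same column.
print(np.array_equal(column, np.atleast_2d(obs).T))  # True
```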

Here's a working version:

import numpy as np
from hmmlearn import hmm

model = hmm.MultinomialHMM(n_components=3)
model.startprob_ = np.array([0.3, 0.4, 0.3])
model.transmat_ = np.array([[0.2, 0.6, 0.2],
                            [0.4, 0.0, 0.6],
                            [0.1, 0.2, 0.7]])
model.emissionprob_ = np.array([[0.2, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1],
                                [0.1, 0.1, 0.1, 0.1, 0.2, 0.1, 0.1, 0.1, 0.1],
                                [0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.2]])

# Predict the most likely sequence of hidden states
X = np.atleast_2d([3, 4, 5, 6, 7]).T
print(model.decode(X))
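For anyone wondering what `decode` does with these parameters: as far as I know it runs the Viterbi algorithm by default and returns a log probability together with the state sequence. Below is a rough NumPy-only sketch of Viterbi using the same matrices. The function `viterbi` and its argument names are my own illustration, not hmmlearn's implementation, and ties between equally likely paths may be broken differently.

```python
import numpy as np

def viterbi(startprob, transmat, emissionprob, obs):
    """Return (log_prob, path) for the most likely hidden state path."""
    # log(0) = -inf is intended here: it marks impossible transitions.
    with np.errstate(divide="ignore"):
        log_start = np.log(startprob)
        log_trans = np.log(transmat)
        log_emit = np.log(emissionprob)

    n_obs, n_states = len(obs), len(startprob)
    delta = np.empty((n_obs, n_states))  # best log prob of a path ending in each state
    backptr = np.zeros((n_obs, n_states), dtype=int)

    delta[0] = log_start + log_emit[:, obs[0]]
    for t in range(1, n_obs):
        # scores[i, j]: best path ending in state i at t-1, then moving to j
        scores = delta[t - 1][:, None] + log_trans
        backptr[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + log_emit[:, obs[t]]

    # Walk the back-pointers from the best final state.
    path = np.empty(n_obs, dtype=int)
    path[-1] = delta[-1].argmax()
    for t in range(n_obs - 1, 0, -1):
        path[t - 1] = backptr[t, path[t]]
    return delta[-1].max(), path

startprob = np.array([0.3, 0.4, 0.3])
transmat = np.array([[0.2, 0.6, 0.2],
                     [0.4, 0.0, 0.6],
                     [0.1, 0.2, 0.7]])
emissionprob = np.array([[0.2, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1],
                         [0.1, 0.1, 0.1, 0.1, 0.2, 0.1, 0.1, 0.1, 0.1],
                         [0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.2]])
obs = [3, 4, 5, 6, 7]

log_prob, states = viterbi(startprob, transmat, emissionprob, obs)
print(log_prob, states)
```

Working in log space avoids underflow on long sequences, and the zero entry in the transition matrix simply becomes `-inf`, so that transition is never chosen.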

edited Mar 09 '16 at 15:56

answered Mar 06 '16 at 22:37

Just as a quick follow-up, why do we take the transpose of X? – lordingtar Mar 09 '16 at 22:27

Because after `np.atleast_2d` the shape of `X` is `(1, n_samples)`. – Sergei Lebedev Mar 10 '16 at 09:11
Assuming I hadn't set the model parameters, how would I call the fit function? It says I require (n_samples, 1), but the above shape of X doesn't work for me. It still says ValueError: Expected sample from Multinomial Distribution – lordingtar Mar 17 '16 at 20:06
The error might be caused by `X` not being contiguous. Ensure that `X` contains all values from the range `[X.min(); X.max()]`. – Sergei Lebedev Mar 17 '16 at 20:35
After `np.atleast_2d` the shape of X is `(5, 1)`, **where 5 is X and 1 is lengths**. Is that the right way to understand it (`model.fit(X, lengths)`, as per the documentation)? @SergeiLebedev – Mari Jun 08 '19 at 19:55
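On the last two comments, here is a NumPy-only sketch. It assumes my reading of the hmmlearn documentation is right: `X` has shape `(n_samples, n_features)`, and `lengths` is a separate list giving the length of each sequence stacked inside `X`. So in `(5, 1)` the 5 is the number of observations and the 1 is the number of features, not `lengths`. The helper `is_contiguous` and the two example sequences are made up for illustration; the helper checks the condition described above, that every value between `X.min()` and `X.max()` appears.

```python
import numpy as np

def is_contiguous(X):
    """True if X uses every integer symbol from X.min() to X.max()."""
    symbols = np.unique(X)  # sorted unique values
    return np.array_equal(symbols, np.arange(symbols[0], symbols[-1] + 1))

# Two hypothetical observation sequences of different lengths.
seq_a = [0, 1, 2, 1]
seq_b = [2, 0, 3]

# Stack them into one column and record where each sequence ends.
X = np.concatenate([seq_a, seq_b]).reshape(-1, 1)
lengths = [len(seq_a), len(seq_b)]

print(X.shape, lengths, is_contiguous(X))  # (7, 1) [4, 3] True

# Symbols 1, 3 and 4 are missing here, so the check fails.
print(is_contiguous(np.array([[0], [2], [5]])))  # False
```

With data like this, the call would presumably be `model.fit(X, lengths)`; for a single sequence `lengths` can be left out.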
