
I am working with scikit-learn's GaussianHMM and get the following ValueError when I try to fit it to some observations. Here is code that demonstrates the error:

>>> import numpy as np
>>> from sklearn.hmm import GaussianHMM
>>> arr = np.matrix([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
>>> arr
matrix([[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]])
>>> gmm = GaussianHMM()
>>> gmm.fit(arr)
/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/numpy/lib/function_base.py:2005: RuntimeWarning: invalid value encountered in divide
  return (dot(X, X.T.conj()) / fact).squeeze()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Library/Python/2.7/site-packages/sklearn/hmm.py", line 427, in fit
    framelogprob = self._compute_log_likelihood(seq)
  File "/Library/Python/2.7/site-packages/sklearn/hmm.py", line 737, in _compute_log_likelihood
    obs, self._means_, self._covars_, self._covariance_type)
  File "/Library/Python/2.7/site-packages/sklearn/mixture/gmm.py", line 58, in log_multivariate_normal_density
    X, means, covars)
  File "/Library/Python/2.7/site-packages/sklearn/mixture/gmm.py", line 564, in _log_multivariate_normal_density_diag
    + np.dot(X ** 2, (1.0 / covars).T))
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/numpy/matrixlib/defmatrix.py", line 343, in __pow__
    return matrix_power(self, other)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/numpy/matrixlib/defmatrix.py", line 160, in matrix_power
    raise ValueError("input must be a square array")
ValueError: input must be a square array
>>> 

How might I remedy this? It seems that I am giving it valid inputs. Thanks!

alko
Jay Hack
  • If I make arr a non-square matrix, however, I still get the same error, even if I wrap it in brackets, i.e. if arr is matrix([[ 1, 2, 3], [ 4, 5, 6], [ 7, 8, 9], [10, 11, 12]]). Any ideas? Thanks! – Jay Hack Dec 17 '13 at 03:57

2 Answers


You have to pass fit a list of observation sequences rather than a single matrix; see the official examples:

>>> gmm.fit([arr])
GaussianHMM(algorithm='viterbi', covariance_type='diag', covars_prior=0.01,
      covars_weight=1,
      init_params='abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ',
      means_prior=None, means_weight=0, n_components=1, n_iter=10,
      params='abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ',
      random_state=None, startprob=None, startprob_prior=1.0, thresh=0.01,
      transmat=None, transmat_prior=1.0)
>>> gmm.n_features
3
>>> gmm.n_components
1
alko
  • The link is dead. See update here. http://scikit-learn.org/0.14/auto_examples/applications/plot_hmm_stock_analysis.html – Fanglin Aug 15 '14 at 23:13

According to the docs, gmm.fit(obs) expects obs to be a list of array-like objects:

obs : list
    List of array-like observation sequences (shape (n_i, n_features)).

Therefore, try:

import numpy as np
from sklearn.hmm import GaussianHMM
arr = np.matrix([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
gmm = GaussianHMM()
print(gmm.fit([arr]))
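
The traceback itself can be reproduced with NumPy alone. This is a minimal sketch, assuming (as the traceback suggests) that fit loops over obs and treats each item as one observation sequence. Looping over a bare np.matrix yields 1 x 3 row matrices, and ** on a np.matrix is a matrix power rather than an elementwise square, so it fails for any non-square input. The variable names below are only for illustration:

```python
import numpy as np

arr = np.matrix([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Iterating over a np.matrix yields one 1 x 3 matrix per row, so each
# "sequence" seen inside fit would be a single, non-square row.
row_shapes = [row.shape for row in arr]

# On np.matrix, ** means matrix power, which requires a square input.
try:
    arr[0] ** 2
    power_error = None
except Exception as exc:  # the exact exception class varies across NumPy versions
    power_error = exc

# On a plain ndarray, ** is elementwise, which is what the density code needs.
elementwise = np.asarray(arr[0]) ** 2

print(row_shapes)
print(type(power_error).__name__, power_error)
print(elementwise)
```

Wrapping the matrix in a list, as above, makes the whole 3 x 3 matrix the single item that the loop sees.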

Hidden Markov models (HMMs) are no longer supported by sklearn; the HMM code has since been moved to the separate hmmlearn package.

unutbu
  • So what I'm unclear on is what each column represents, if we have to have as many columns as emission possibilities. Why can't we just pass a 1-dimensional sequence of emissions? – Brooks Jun 11 '15 at 20:21