I am trying to fit and train an HMM using hmmlearn, but I get this warning, which I don't fully understand:
Fitting a model with 117917879 free scalar parameters with only 550034 data points will result in a degenerate solution.
My dataset is fairly large, but I don't understand where the 117917879 free scalar parameters come from, or what it means for a solution to be degenerate.
I define my HMM as follows:
import numpy as np
from hmmlearn import hmm
# vocab_size = 10858, is the number of states
model = hmm.GaussianHMM(n_components=vocab_size, covariance_type="full")
# frequency_list = list of length 10858, containing the initial probability of each state
model.startprob_ = np.array(frequency_list)
# transitions is a (10858, 10858) array containing the transition probabilities
model.transmat_ = np.array(transitions)
# integer_array = my data converted to an integer array (size = 550034), reshaped to one feature column
integer_array = integer_array.reshape(-1,1)
model.fit(integer_array)
Could anyone help me improve this, or at least explain where the scalar parameters come from and what a degenerate solution is?
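For reference, the figure in the warning can be reproduced by hand. The sketch below assumes that a Gaussian HMM's free parameters are counted as: start probabilities (one fewer than the number of states, since they sum to 1), transition probabilities (each row loses one degree of freedom for the same reason), one mean per state and feature, and the unique entries of each state's covariance matrix. The helper name gaussian_hmm_free_params is made up for illustration and is not part of hmmlearn.

```python
def gaussian_hmm_free_params(n_components, n_features, covariance_type="full"):
    """Count free scalar parameters of a Gaussian HMM (illustrative helper)."""
    # Start probabilities sum to 1, so one entry is determined by the rest.
    startprob = n_components - 1
    # Each row of the transition matrix sums to 1 as well.
    transmat = n_components * (n_components - 1)
    # One mean per state and per feature.
    means = n_components * n_features
    # Number of covariance entries depends on the covariance structure.
    covars = {
        "spherical": n_components,
        "diag": n_components * n_features,
        "full": n_components * n_features * (n_features + 1) // 2,
        "tied": n_features * (n_features + 1) // 2,
    }[covariance_type]
    return startprob + transmat + means + covars


# Values from the question: 10858 states, 1 feature, full covariance.
total = gaussian_hmm_free_params(10858, 1, "full")
transition_only = 10858 * (10858 - 1)
print(total)            # 117917879
print(transition_only)  # 117885306
```

Under these assumptions the total matches the warning exactly, and almost all of it is the transition matrix (10858 * 10857 = 117885306). With only 550034 observations there are far fewer data points than parameters, so many of those parameters cannot be pinned down by the data, which appears to be what the warning means by a degenerate solution.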