I am currently training a Hidden Markov Model with NLTK on a set of surgical data, like so:
import nltk
# 15 hidden states, 90 output symbols
nltkTrainer = nltk.tag.hmm.HiddenMarkovModelTrainer(range(15), range(90))
model = nltkTrainer.train_unsupervised(data, max_iterations=3)
If it's helpful, 'model' prints as 'HiddenMarkovModelTagger 15 states and 90 output symbols'.
However, the full training takes nearly an hour on my machine, so I want to serialize the trained 'model' and save and load it between sessions. From what I've read, everybody seems to use Python's built-in pickle, which works fine for known data types. I can even pickle my trained model without any error using this code:
import pickle
# the with block closes the file even if dump() raises
with open('my_classifier.pickle', 'wb') as f:
    pickle.dump(model, f)
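For completeness, the load side is just the mirror image of the dump. Below is a minimal sketch of the save/load round trip. The helper names `save_model` and `load_model` are made up for illustration, and a plain dict stands in for the trained tagger so the snippet runs without NLTK or my data:

```python
import os
import pickle
import tempfile


def save_model(obj, path):
    """Hypothetical helper: pickle obj to path in binary mode."""
    with open(path, 'wb') as f:
        pickle.dump(obj, f)


def load_model(path):
    """Hypothetical helper: read back whatever save_model wrote."""
    with open(path, 'rb') as f:
        return pickle.load(f)


# Stand-in object (an assumption, not the real HMM tagger).
stand_in = {'states': 15, 'symbols': 90}

# Write into a fresh temporary directory so nothing is overwritten.
path = os.path.join(tempfile.mkdtemp(), 'my_classifier.pickle')
save_model(stand_in, path)
restored = load_model(path)
```

With a plain dict this round trip works; with the real 'model' the dump succeeds in the same way, and only the load step fails.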
But when trying to load the pickled file, I get the error:
/usr/local/lib/python2.7/dist-packages/nltk/probability.pyc in __init__(self, probdist_dict)
1971 """
1972 defaultdict.__init__(self, DictionaryProbDist)
-> 1973 self.update(probdist_dict)
1974
1975 ##//////////////////////////////////////////////////////
TypeError: 'type' object is not iterable
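My best guess from the traceback, and it is only a guess, is that the failing class is a defaultdict subclass whose `__init__` expects a dict, while unpickling calls it with the default factory (here the `DictionaryProbDist` class) instead. The sketch below reproduces that pattern without NLTK. `DictBackedDist` is a made-up stand-in, not the NLTK class, and it replays by hand the constructor call that pickle would make on load:

```python
from collections import defaultdict


class DictBackedDist(defaultdict):
    """Hypothetical stand-in: a defaultdict subclass whose __init__
    takes a dict, like the __init__ shown in the traceback."""

    def __init__(self, data):
        defaultdict.__init__(self, list)  # default_factory is the type `list`
        self.update(data)                 # raises if data is not dict-like


original = DictBackedDist({'a': [1, 2]})

# pickle records what __reduce__ returns: the class, plus the arguments
# to call it with on load. defaultdict supplies (default_factory,),
# not the dict that our __init__ expects.
cls, args = original.__reduce__()[:2]

try:
    cls(*args)      # the same constructor call unpickling would make
    error = None
except TypeError as exc:
    error = exc     # update() was handed a type instead of a dict

print(args)
print(error)
```

If this reading is right, the dump succeeds because nothing calls `__init__` at that point, and the problem only shows up when the object is rebuilt on load.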
Has anybody found a way around this? Is it an issue with NLTK?