Pickling a trained NLTK Model

Question

I am currently training a Hidden Markov Model on a set of surgical data like so:

import nltk

# 15 hidden states, 90 output symbols
nltkTrainer = nltk.tag.hmm.HiddenMarkovModelTrainer(range(15), range(90))
model = nltkTrainer.train_unsupervised(data, max_iterations=3)

If it's helpful, 'model' is reported as 'HiddenMarkovModelTagger 15 states and 90 output symbols'.

However, it takes nearly an hour to run the full training on my machine, so I want to serialize the trained 'model' object and save and load it between sessions. From what I've read, everybody seems to use Python's built-in pickle, which works fine for known data types. I can even pickle my trained model using this code:

import pickle

# the with block closes the file even if dump() raises
with open('my_classifier.pickle', 'wb') as f:
    pickle.dump(model, f)

But when trying to load the pickled file, I get the error:

/usr/local/lib/python2.7/dist-packages/nltk/probability.pyc in __init__(self, probdist_dict)
   1971         """
   1972         defaultdict.__init__(self, DictionaryProbDist)
-> 1973         self.update(probdist_dict)
   1974 
   1975 ##//////////////////////////////////////////////////////

TypeError: 'type' object is not iterable

Has anybody found a way around this? Is it an issue with NLTK?
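For reference, here is a minimal sketch of the full save/load round trip being attempted. The helper names `save_model` and `load_model` are hypothetical, not part of NLTK or pickle, and the sketch assumes the object being stored is picklable in the first place, which is exactly what fails in the traceback above.

```python
import pickle


def save_model(model, path):
    """Write any picklable object to `path` (hypothetical helper)."""
    with open(path, "wb") as f:
        # HIGHEST_PROTOCOL gives the most compact binary format available
        pickle.dump(model, f, protocol=pickle.HIGHEST_PROTOCOL)


def load_model(path):
    """Read back an object written by save_model (hypothetical helper)."""
    with open(path, "rb") as f:
        return pickle.load(f)
```

Usage would be `save_model(model, 'my_classifier.pickle')` after training and `model = load_model('my_classifier.pickle')` in a later session. Only load pickle files you created yourself, since unpickling can run arbitrary code.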

Well, working now...reinstalled NLTK from the source egg and problem went away. Don't have the faintest idea what the problem was, other than perhaps this: github.com/nltk/nltk/issues/293 — tyleha, Oct 22 '12 at 23:25
Have you tried `import cPickle as pickle` and `pickle.dump(model, fout, protocol=pickle.HIGHEST_PROTOCOL)`? — alvas, Nov 07 '13 at 18:39
Figured it out in the end by getting in touch with the NLTK developers: there was a defaultdict subclass whose `__init__` takes a different argument than pickle expects, and defaultdict has its own, annoying way of being pickled. That defaultdict was not being handled. This issue should be fixed in the latest dev version of NLTK. — tyleha, Nov 12 '13 at 20:43
For documentation, can you put a link to the email/nltk-user mailing list archive? — alvas, Nov 13 '13 at 10:56
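For anyone landing here later, the defaultdict behaviour described in the comments can be reproduced without NLTK. This is only a sketch under assumptions: `BrokenDist` and `FixedDist` are hypothetical stand-ins for the class in the traceback, `list` stands in for `DictionaryProbDist` as the default factory, and the `__reduce__` override is one possible workaround, not necessarily how NLTK fixed it. The cause is that defaultdict tells pickle to rebuild the object by calling the class with its default factory as the first argument, so a subclass whose `__init__` expects a dict receives a type instead.

```python
import pickle
from collections import defaultdict


class BrokenDist(defaultdict):
    """Hypothetical stand-in for the NLTK class in the traceback."""

    def __init__(self, initial):
        # Fixed default factory, contents supplied by the caller
        defaultdict.__init__(self, list)
        self.update(initial)


class FixedDist(BrokenDist):
    """Same class, but tells pickle how to call __init__ correctly."""

    def __reduce__(self):
        # Rebuild by passing a plain dict of the contents to __init__,
        # instead of defaultdict's default of passing default_factory.
        return (self.__class__, (dict(self),))


def roundtrip(obj):
    """Pickle and immediately unpickle an object in memory."""
    return pickle.loads(pickle.dumps(obj))


try:
    roundtrip(BrokenDist({"a": 1}))
    broken_error = None
except TypeError as exc:
    # Dumping succeeds; loading calls BrokenDist(list) and update(list) fails
    broken_error = exc

restored = roundtrip(FixedDist({"a": 1}))
```

Note that dumping works in both cases and only loading fails, which matches the symptom in the question.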
