
I've trained an SVM classifier using the NLTK and svmlight Python libraries. When I call pickle.dump(my_classifier, outfile, 1) to save the classifier, it throws this error:

File "/usr/lib/python2.7/pickle.py", line 313, in save
    (t.__name__, obj))
pickle.PicklingError: Can't pickle 'PyCObject' object: <PyCObject object at 0xc1cbd50>

I read that a CObject can't be pickled, but I haven't found a way to save my work.

How should I proceed? I'm using Python 2.7.3.

For what it's worth, for those who know NLTK: everything works fine when I pickle other classifiers such as MaxentClassifier or NaiveBayesClassifier, but not SvmClassifier. I think it has something to do with the svmlight library, but this is the first time I've used it.

Lennart Regebro
rafa

1 Answer


You can use the method write_model(model, filename) from the svmlight library to save it. Maybe you can teach pickle to use that as a custom protocol for pickling.
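
A minimal sketch of that idea, assuming the unpicklable handle can be reached from a wrapper class you control. `FakeHandle`, `PicklableModel`, and the two module-level functions here are hypothetical stand-ins so the example runs without svmlight: `write_model` mirrors the call named above, and `read_model` is an assumed loader, so check that your svmlight binding provides one. Where `SvmClassifier` actually keeps its model is not shown in this thread.

```python
import os
import pickle
import tempfile


class FakeHandle(object):
    """Hypothetical stand-in for the opaque svmlight model (the PyCObject)."""

    def __init__(self, weights):
        self.weights = list(weights)

    def __reduce_ex__(self, protocol):
        # Mimic the failure from the question: this object refuses to pickle.
        raise pickle.PicklingError("Can't pickle stand-in handle")


def write_model(model, filename):
    # Stand-in with the same shape as svmlight's write_model(model, filename).
    with open(filename, "w") as f:
        f.write(" ".join(repr(w) for w in model.weights))


def read_model(filename):
    # Assumed counterpart that loads what write_model saved.
    with open(filename) as f:
        return FakeHandle(float(tok) for tok in f.read().split())


class PicklableModel(object):
    """Wraps the handle and pickles the contents of its model file instead."""

    def __init__(self, handle):
        self.handle = handle

    def __getstate__(self):
        # Called by pickle.dump: save the model to a temp file, keep its bytes.
        fd, path = tempfile.mkstemp()
        os.close(fd)
        try:
            write_model(self.handle, path)
            with open(path, "rb") as f:
                return {"model_bytes": f.read()}
        finally:
            os.remove(path)

    def __setstate__(self, state):
        # Called by pickle.load: write the bytes back out and reload the model.
        fd, path = tempfile.mkstemp()
        os.close(fd)
        try:
            with open(path, "wb") as f:
                f.write(state["model_bytes"])
            self.handle = read_model(path)
        finally:
            os.remove(path)
```

With this in place, `pickle.dump(PicklableModel(handle), outfile, 1)` stores only the file contents and never touches the handle directly. It only helps if the library's own save call works on your model, which the segmentation fault reported in the comments puts in doubt.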

Hans Then
  • What do you mean by teaching pickle a custom protocol? Modifying pickle.py? – rafa Jul 31 '13 at 12:29
  • Hans is referring to the standard pickle extension points in http://docs.python.org/2/library/copy_reg.html which is specifically for this task. – SingleNegationElimination Jul 31 '13 at 12:36
  • I meant that pickle can perhaps be configured to use custom picklers for a specific data type. However, it turns out that if you add `__getstate__()` and `__setstate__()` methods to your (subclass of the) model class, it becomes picklable. See: http://docs.python.org/2/library/pickle.html#object.__getstate__ – Hans Then Jul 31 '13 at 12:40
  • @TokenMacGuy yes, that was what I meant. Thanks for the link. Apparently there is more than one way to do it. – Hans Then Jul 31 '13 at 12:42
  • I'm sorry, I'm kind of new to pickling things; all I know so far is `dump` and `load`. I read these links and I have no idea how to implement these methods. I don't even see this `PyCObject`, because apparently it's used inside the `svmlight` module, which is called by `SvmClassifier`, which is called from my script. What's the best way to fix this without leaving my script (if possible)? – rafa Jul 31 '13 at 13:02
  • Oh, and one more thing, when I tried `write_model` from svmlight, I got `Segmentation fault`... – rafa Jul 31 '13 at 13:05
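
On the question of whether this means modifying pickle.py: it does not. The `copy_reg` extension point linked in the comments is used from your own script. Below is a rough sketch under stated assumptions: `OpaqueModel`, `rebuild_opaque`, and `reduce_opaque` are hypothetical names, and the stand-in class exposes its data as a plain tuple, which a real `PyCObject` does not. For svmlight the reducer would have to get the data out some other way, for example through a model file.

```python
import pickle

try:
    import copy_reg as copyreg  # Python 2 name, as in the question
except ImportError:
    import copyreg  # Python 3 name of the same module


class OpaqueModel(object):
    """Hypothetical stand-in for a type pickle cannot handle by itself."""

    def __init__(self, weights):
        self.weights = tuple(weights)

    def __reduce_ex__(self, protocol):
        # Without a registered reducer, pickling this object fails.
        raise pickle.PicklingError("Can't pickle OpaqueModel directly")


def rebuild_opaque(weights):
    # Called at load time with the arguments stored by reduce_opaque.
    return OpaqueModel(weights)


def reduce_opaque(model):
    # Return (callable, args); pickle stores both and calls callable(*args)
    # when the object is loaded again.
    return rebuild_opaque, (model.weights,)


# Register the reducer for this type. pickle consults this table before it
# falls back to the object's own __reduce_ex__.
copyreg.pickle(OpaqueModel, reduce_opaque)
```

After registration, plain `pickle.dump` and `pickle.load` calls go through `reduce_opaque` and `rebuild_opaque` for every `OpaqueModel` instance, including ones nested inside other objects, so the calling code does not change.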