Summary

I created a child class that extends sklearn's NMF. Pickling an instance with dill works, but unpickling it fails unless I redefine the class first.

Minimal (not) Working Example

First, create some custom classes and instances, and pickle them:

# imports
import dill
from sklearn.decomposition import NMF

# custom classes
class NMFwrapper(NMF): # custom child class
    def fit(self, X, y=None): # y is optional, as in NMF.fit
        print("Slightly modifying NMF")
        return super().fit(X, y) # the parent's fit already returns self

class CustomClass(): # dummy class
    def onlymethod(self):
        print("I'm the only method")

class CustomChild(CustomClass): # dummy child class
    def second_method(self):
        print("I'm a new child method")

# instances
normal_nmf = NMF()
new_nmf = NMFwrapper()
custom = CustomClass()
child = CustomChild()

# pickle all of them
with open("nmf.pkl", "wb") as f:
    dill.dump(normal_nmf, f)
with open("nmf_wrapper.pkl", "wb") as f:
    dill.dump(new_nmf, f)
with open("custom.pkl", "wb") as f:
    dill.dump(custom, f)
with open("child.pkl", "wb") as f:
    dill.dump(child, f)

Then in a second terminal/kernel/REPL/... do the following:

import dill
with open("nmf.pkl", "rb") as f:
    normal_nmf = dill.load(f) # works
with open("custom.pkl", "rb") as f:
    custom = dill.load(f) # works
with open("child.pkl", "rb") as f:
    child = dill.load(f) # works
with open("nmf_wrapper.pkl", "rb") as f:
    new_nmf = dill.load(f) # FAILS

Problem

The last load fails with the following stack trace:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-2-2327d8f76861> in <module>
      7     child = dill.load(f)
      8 with open("nmf_wrapper.pkl", "rb") as f:
----> 9     new_nmf = dill.load(f)

~/miniconda3/envs/test/lib/python3.9/site-packages/dill/_dill.py in load(file, ignore, **kwds)
    276 def load(file, ignore=None, **kwds):
    277     """unpickle an object from a file"""
--> 278     return Unpickler(file, ignore=ignore, **kwds).load()
    279 
    280 def loads(str, ignore=None, **kwds):

~/miniconda3/envs/test/lib/python3.9/site-packages/dill/_dill.py in load(self)
    479 
    480     def load(self): #NOTE: if settings change, need to update attributes
--> 481         obj = StockUnpickler.load(self)
    482         if type(obj).__module__ == getattr(_main_module, '__name__', '__main__'):
    483             if not self._ignore:

~/miniconda3/envs/test/lib/python3.9/site-packages/dill/_dill.py in find_class(self, module, name)
    469             return type(None) #XXX: special case: NoneType missing
    470         if module == 'dill.dill': module = 'dill._dill'
--> 471         return StockUnpickler.find_class(self, module, name)
    472 
    473     def __init__(self, *args, **kwds):

AttributeError: Can't get attribute 'NMFwrapper' on <module '__main__'>

Variation

Replacing dill with pickle is even worse: only the first load (the plain NMF) works, and the other three fail.

Redefining the custom wrapper (for dill), or the custom class and its child (for pickle), before unpickling makes the loads succeed.
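For the plain pickle case, the effect of redefining a class can be reproduced in a single process. The following is a minimal sketch using only the standard library; the module name `demo_session` and the class `Wrapper` are hypothetical stand-ins, not part of the code above. It illustrates that pickle stores a reference to the class (module name plus class name) rather than the class body, so loading needs a class of that name to exist. It does not explain dill's behaviour with NMFwrapper.

```python
import pickle
import sys
import types

# Hypothetical stand-in for the module a class lives in (e.g. __main__).
session = types.ModuleType("demo_session")
sys.modules["demo_session"] = session

CLASS_SOURCE = '''
class Wrapper:
    def greet(self):
        return "hello"
'''

# "First session": define the class and pickle an instance.
exec(CLASS_SOURCE, session.__dict__)
payload = pickle.dumps(session.Wrapper())

# The stream names the module and the class, but carries no method code.
names_stored = b"demo_session" in payload and b"Wrapper" in payload
code_stored = b"greet" in payload

# "Second session": the class no longer exists, so the lookup fails.
del session.Wrapper
try:
    pickle.loads(payload)
    load_failed = False
except AttributeError as err:
    load_failed = True
    print(err)

# Redefining the class under the same name makes the same bytes loadable.
exec(CLASS_SOURCE, session.__dict__)
restored = pickle.loads(payload)
print(restored.greet())
```

The NMF instance loads without any redefinition because its class can be imported from sklearn.decomposition in the new session, whereas a class defined interactively exists only in the session that created it.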

Question

Why can dill and pickle deserialize an NMF instance, yet each has trouble with custom classes or child classes? It seems strange that dill can deserialize a custom class and a child of a custom class, but not a child of a library class.

Battleman
  • Do you still have this problem? With which Python, `dill` and `sklearn` versions? I couldn't reproduce your error. – leogama Aug 17 '22 at 01:04
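The version information requested in the comment can be gathered with a short script. This is a sketch using only the standard library; the helper name `collect_versions` is made up for illustration, and it assumes the packages are installed under the distribution names `dill` and `scikit-learn`.

```python
import sys
from importlib import metadata


def collect_versions(packages=("dill", "scikit-learn")):
    """Return a mapping of name -> version string (None if not installed)."""
    report = {"python": sys.version.split()[0]}
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = None
    return report


print(collect_versions())
```

Running it in both the pickling and the unpickling environment also shows whether the two sessions use the same installation.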

0 Answers