
I saved a bunch of data in a single pickle file with dill, and some of the data are objects. Unfortunately, I didn't realize that if I changed my code structure (e.g. through heavy refactoring), dill would no longer be able to find my code when loading the objects. At least, that is my guess from the traceback:

------- Main Resume from Checkpoint  --------
Traceback (most recent call last):
  File "/Users/brandomiranda/opt/anaconda3/envs/meta_learning/lib/python3.9/site-packages/torch/serialization.py", line 607, in load
    return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
  File "/Users/brandomiranda/opt/anaconda3/envs/meta_learning/lib/python3.9/site-packages/torch/serialization.py", line 882, in _load
    result = unpickler.load()
  File "/Users/brandomiranda/opt/anaconda3/envs/meta_learning/lib/python3.9/site-packages/torch/serialization.py", line 875, in find_class
    return super().find_class(mod_name, name)
ModuleNotFoundError: No module named 'meta_learning'
python-BaseException

However, it's OK if it cannot find them. There is other data in the file that I still need and can use to restore everything.

How can I "force" dill to load the file even if it cannot find the code? Alternatively, I can point it to the locations of the new code if that is what it needs.
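For the "point it to the new code" route, one thing that might work is aliasing the old module name to the refactored module in `sys.modules` before loading. The following is only a sketch with the standard `pickle` module, under the assumption that dill resolves by-reference classes through the same import lookup. The names `old_pkg_demo`, `new_pkg_demo`, `Model`, and `make_module` are hypothetical stand-ins, not anything from the real project.

```python
import pickle
import sys
import types

# Hypothetical class source, standing in for a model class in the real project.
MODEL_SRC = """
class Model:
    def __init__(self, weight):
        self.weight = weight
"""


def make_module(name):
    """Create an in-memory module called `name` that defines Model."""
    module = types.ModuleType(name)
    exec(MODEL_SRC, module.__dict__)
    sys.modules[name] = module
    return module


# Before the refactor: the class lives in "old_pkg_demo" and is pickled by reference.
old_pkg = make_module("old_pkg_demo")
payload = pickle.dumps({"model": old_pkg.Model(3.0), "step": 10})

# After the refactor: the old module is gone and the class lives in "new_pkg_demo".
del sys.modules["old_pkg_demo"]
new_pkg = make_module("new_pkg_demo")

try:
    pickle.loads(payload)
    load_failed = False
except ModuleNotFoundError:
    load_failed = True  # same kind of failure as in the traceback

# Alias the old module name to the new module, then load again.
sys.modules["old_pkg_demo"] = new_pkg
restored = pickle.loads(payload)
```

In a real package, each dotted module path recorded in the pickle would probably need its own alias, and the classes would have to keep the same names in their new location.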

Charlie Parker
  • `Unpickler` uses this settings object, which you can play around with and overwrite: https://github.com/uqfoundation/dill/blob/master/dill/settings.py Otherwise you may have to step through each exception and reverse engineer the pieces you need to get the content loaded. – Bugbeeb Dec 14 '21 at 01:28
  • I'm the `dill` author. For some objects, `dill` can pickle without reference, however for others it prefers a reference. So, for a class defined in `__main__`, it will store everything including the definition -- but for a class defined in an installed module, it will store a reference to the class definition in the installed module. Generally, for anything that was in an installed module (i.e. in `site-packages`), you'll need a reference definition. It's hard to say what to try without knowing what type of objects your data is, or how it was stored. – Mike McKerns Dec 14 '21 at 14:54
  • @MikeMcKerns I install all the work I am developing with `pip install -e .`. The objects in question are PyTorch models for which I have defined classes. Judging from the error, dill saved a reference, so even though it is a model/class I wrote, it stored the path. Does that also happen if the class in question references other parts of a library installed with regular (non-editable) pip, e.g. a regular PyTorch install? – Charlie Parker Dec 14 '21 at 16:40
  • My current plan is to restore my old code from before the heavy refactoring, create a new branch, recover the checkpoint/pickle files, and save only the data, without references to objects. I can't see what else I could do, given what you said. – Charlie Parker Dec 14 '21 at 16:41
  • Yeah, `dill` (and any serializer) looks at what state is required to store the target object, and then recursively pickles each element of the state (that it will later plug into a function that creates a "new copy" of the target object). If any bit of state is defined in an installed module, then that bit of state will generally get stored by reference. The assumption is that you have the same installed modules wherever you want the object loaded/dumped. – Mike McKerns Dec 15 '21 at 00:56
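
To illustrate the by-reference behaviour described in the comments, and one possible way to still get at the remaining data, here is a minimal sketch using the standard `pickle.Unpickler.find_class` hook. It substitutes a placeholder class whenever the referenced module cannot be imported. `Placeholder`, `TolerantUnpickler`, `gone_pkg_demo`, and `Model` are hypothetical names for this example. Whether the same subclassing works unchanged with `dill.Unpickler`, or through the `pickle_module` argument of `torch.load`, is an assumption that has not been verified here.

```python
import io
import pickle
import sys
import types


class Placeholder:
    """Stand-in for any class whose defining module can no longer be imported."""

    def __init__(self, *args, **kwargs):
        # Keep constructor arguments in case the pickle rebuilds the object by call.
        self.init_args = args
        self.init_kwargs = kwargs


class TolerantUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        try:
            return super().find_class(module, name)
        except (ImportError, AttributeError):
            # Build a throwaway class that remembers the original name and module.
            return type(name, (Placeholder,), {"__module__": module})


# Hypothetical module that exists at dump time and is missing at load time.
gone_pkg = types.ModuleType("gone_pkg_demo")
exec(
    "class Model:\n"
    "    def __init__(self, weight):\n"
    "        self.weight = weight\n",
    gone_pkg.__dict__,
)
sys.modules["gone_pkg_demo"] = gone_pkg

# The instance is stored as a reference to gone_pkg_demo.Model plus its state.
payload = pickle.dumps({"model": gone_pkg.Model(3.0), "step": 10})

# Simulate the refactor: the module can no longer be found.
del sys.modules["gone_pkg_demo"]

restored = TolerantUnpickler(io.BytesIO(payload)).load()
model_state = vars(restored["model"])  # plain attribute dict of the placeholder
```

The plain data (`step` here) comes back unchanged, and the attributes of the unresolved object end up on the placeholder instance, so they can be copied out and saved again without any class references. Objects that depend on custom `__reduce__` or `__setstate__` logic may not survive this treatment.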

0 Answers