The default pickle module from the Python standard library cannot serialize functions with closures, lambdas, or functions defined in __main__ (see here).
I need to pickle an object that uses some custom functions, and those functions will not be importable in the environment where the object is unpickled. A few other Python object serializers, including dill and cloudpickle, are capable of doing this.
The cloudpickle documentation seems to say that even when you pickle with cloudpickle, you can unpickle with the standard pickle module. This is extremely attractive, because I cannot install any packages in the environment where I need to unpickle.
Indeed, the example in the documentation does basically the following:
Pickle:
>>> import cloudpickle
>>> squared = lambda x: x ** 2
>>> cloudpickle.dump(squared, open('pickled_file', 'wb'))
Unpickle:
>>> import pickle
>>> new_squared = pickle.load(open('pickled_file', 'rb'))
>>> new_squared(2)
But running that second block in an environment where cloudpickle is not installed, even though it is never imported, yields the error:
"ImportError: No module named cloudpickle.cloudpickle"
Probably the most easily reproducible example is to install cloudpickle for Python 2, run the first block, and then try to load the pickled file with the second block using Python 3 (where cloudpickle is not installed).
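For comparison, the same kind of error can be produced with the standard library alone, which suggests the pickle file may refer to a module by name and import it at load time. This is only a minimal sketch of that mechanism, not of what cloudpickle actually writes: fake_helper and rebuild are made-up names standing in for a package that is present when pickling but missing when unpickling.

```python
import pickle
import sys
import types

# Hypothetical stand-in for a third-party package (not a real module).
helper = types.ModuleType("fake_helper")
exec("def rebuild(x):\n    return x * 2", helper.__dict__)
sys.modules["fake_helper"] = helper

# The standard pickler stores a module-level function by reference:
# only the module name and the function name go into the byte stream.
payload = pickle.dumps(helper.rebuild)
module_name_in_payload = b"fake_helper" in payload

# Simulate the target environment, where the package is not installed.
del sys.modules["fake_helper"]

try:
    pickle.loads(payload)
    load_error = None
except ImportError as exc:
    # pickle.loads tries to import the referenced module and fails,
    # even though this script never wrote "import fake_helper".
    load_error = exc

print(module_name_in_payload, repr(load_error))
```

If the file written by cloudpickle contains references like this to cloudpickle.cloudpickle, that would explain the ImportError even though the loading code never imports it explicitly.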
What is going on here? Why does cloudpickle need to be installed to run the standard pickle load if it is never even called?