
I know that in order to be picklable, a class has to override the __reduce__ method, which has to return a string or a tuple.

How does this function work? What is the exact usage of __reduce__? When will it be used?

pppery
oyjh
  • 3
    Just a note: A class does not have to overwrite the `__reduce__` method in order to be picklable. At least in recent versions of Python. As the [documentation states](https://docs.python.org/3/library/pickle.html#pickling-class-instances): _"In most cases, no additional code is needed to make instances picklable."_ – Jeyekomon Jun 24 '21 at 08:11
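The point made in this comment can be checked with a minimal sketch. The class `Plain` and its attributes are hypothetical names chosen for illustration; the only assumption is that every attribute is itself picklable.

```python
import pickle


class Plain:
    """Hypothetical class with no pickle-specific methods at all."""

    def __init__(self, name, values):
        self.name = name
        self.values = values


original = Plain("demo", [1, 2, 3])

# No __reduce__, __getstate__ or __setstate__ is defined, yet the
# default machinery inherited from object handles the round trip.
restored = pickle.loads(pickle.dumps(original))

print(restored.name, restored.values)
```

Custom hooks only become necessary when an attribute cannot be pickled, such as the open file handle discussed in the answer.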

1 Answer


When you try to pickle an object, some of its attributes might not serialize well. One example is an open file handle: pickle won't know how to handle it and will raise an error.

You can tell the pickle module how to handle these types of objects directly within the class. Let's look at an example of an object with a single attribute, an open file handle:

import pickle

class Test(object):
    def __init__(self, file_path="test1234567890.txt"):
        # An open file in write mode
        self.some_file_i_have_opened = open(file_path, 'wb')

my_test = Test()
# Now, watch what happens when we try to pickle this object:
pickle.dumps(my_test)

It should fail and give a traceback:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  --- snip snip a lot of lines ---
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/copy_reg.py", line 70, in _reduce_ex
    raise TypeError, "can't pickle %s objects" % base.__name__
TypeError: can't pickle file objects

However, had we defined a __reduce__ method in our Test class, pickle would have known how to serialize this object:

import pickle

class Test(object):
    def __init__(self, file_path="test1234567890.txt"):
        # Used later in __reduce__
        self._file_name_we_opened = file_path
        # An open file in write mode
        self.some_file_i_have_opened = open(self._file_name_we_opened, 'wb')
    def __reduce__(self):
        # we return a tuple of class_name to call,
        # and optional parameters to pass when re-creating
        return (self.__class__, (self._file_name_we_opened, ))

my_test = Test()
saved_object = pickle.dumps(my_test)
# Just print the representation of the string of the object,
# because it contains newlines.
print(repr(saved_object))

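On Python 2 this should give you something like: "c__main__\nTest\np0\n(S'test1234567890.txt'\np1\ntp2\nRp3\n." (Python 3 prints a binary bytes object instead). Either way, the result can be used to recreate the object, including its open file handle. Note that loading calls Test(file_path) again, so the file is reopened in 'wb' mode, which truncates it: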

print(vars(pickle.loads(saved_object)))
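As for when `__reduce__` is used, the following is a minimal sketch rather than a complete account. The class `Point` and its `reduce_calls` counter are hypothetical and exist only to make the calls visible. It suggests that `__reduce__` runs when the object is serialized (and when the `copy` module copies it), but not when the pickle is loaded; loading simply calls the returned callable with the returned arguments.

```python
import copy
import pickle


class Point:
    """Hypothetical class that counts how often __reduce__ runs."""

    reduce_calls = 0

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __reduce__(self):
        Point.reduce_calls += 1
        # (callable, args): unpickling will call Point(self.x, self.y)
        return (Point, (self.x, self.y))


p = Point(1, 2)

payload = pickle.dumps(p)            # serializing invokes __reduce__
calls_after_dumps = Point.reduce_calls

restored = pickle.loads(payload)     # loading only calls Point(1, 2)
calls_after_loads = Point.reduce_calls

shallow = copy.copy(p)               # the copy module uses __reduce__ too
calls_after_copy = Point.reduce_calls

print(calls_after_dumps, calls_after_loads, calls_after_copy)
```

Because unpickling goes through the normal constructor here, any side effect of `__init__` (such as opening a file) happens again on load.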

In general, the __reduce__ method needs to return a tuple with at least two elements:

  1. A callable that creates the initial version of the object. In this case, self.__class__
  2. A tuple of arguments to pass to the class constructor. In the example it's a single string, which is the path to the file to open.

Consult the docs for a detailed explanation of what else the __reduce__ method can return.
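One of the other forms the docs describe is a three-element tuple, where the third element is the object's state. The sketch below assumes a hypothetical `Counter` class; since it defines no `__setstate__`, the state dictionary is used to update the new instance's `__dict__` after the callable has been invoked.

```python
import pickle


class Counter:
    """Hypothetical class whose constructor does not take all of its state."""

    def __init__(self, name):
        self.name = name
        self.count = 0

    def increment(self):
        self.count += 1

    def __reduce__(self):
        # (callable, args, state): Counter(self.name) is called first,
        # then the state dict is applied to the new instance.
        return (Counter, (self.name,), {"count": self.count})


hits = Counter("hits")
for _ in range(3):
    hits.increment()

clone = pickle.loads(pickle.dumps(hits))
print(clone.name, clone.count)
```

This form is useful when part of the state is built up after construction and cannot be passed back through the constructor.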

iron9
VooDooNOFX
  • 10
    But the same may be achieved using `__getstate__`, `__setstate__`. – Sklavit Apr 06 '17 at 14:43
  • 7
    @Sklavit which is better to use? `__getstate__`/`__setstate__` or `__reduce__`? – Jason S May 05 '17 at 18:44
  • 3
    @JasonS As I understand it, `__getstate__`/`__setstate__` are the high-level interface and `__reduce__` is the low-level one. So I prefer to use the high-level interface. – Sklavit May 08 '17 at 15:49
  • 2
    Do I understand correctly that `__getstate__` is not called when `__reduce__` is defined? – Mr_and_Mrs_D Apr 21 '20 at 09:44
  • 7
    Pickling an object is presumably about recording the _current state_ of the object. A file handle is a good example of a property that contains a lot of its own state and just reopening the file wouldn't be enough to restore the state of the `test` object. You'd also want to record `self.some_file_i_have_opened.tell()` and any other state that was of interest to the `test` class. See https://docs.python.org/3/library/pickle.html#handling-stateful-objects for a more complete example (using `__getstate__`/`__setstate__` as it happens). – John Marshall Jun 25 '20 at 08:38
  • 1
    I got here having issues with `multiprocessing` pickling my high-level python class objects, hope I can fix it using this. Thanks! – Mohammad Moallemi Feb 24 '21 at 13:51
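The `__getstate__`/`__setstate__` alternative raised in the comments above, including the suggestion to record the file position with `tell()`, might look roughly like this. It is a sketch under stated assumptions: the class `LineReader`, its attributes and the temporary file are all hypothetical, and the file is opened for reading so that reopening it does not truncate it.

```python
import os
import pickle
import tempfile


class LineReader:
    """Hypothetical reader that remembers its position across pickling."""

    def __init__(self, path):
        self.path = path
        self.handle = open(path, "rb")

    def __getstate__(self):
        # Copy the instance dict, drop the unpicklable handle,
        # and record the current read position instead.
        state = self.__dict__.copy()
        del state["handle"]
        state["offset"] = self.handle.tell()
        return state

    def __setstate__(self, state):
        # Restore plain attributes, then reopen the file and seek back.
        offset = state.pop("offset")
        self.__dict__.update(state)
        self.handle = open(self.path, "rb")
        self.handle.seek(offset)


# Build a small throwaway file to read from.
with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as tmp:
    tmp.write(b"first\nsecond\n")

reader = LineReader(tmp.name)
first_line = reader.handle.readline()

clone = pickle.loads(pickle.dumps(reader))
next_line = clone.handle.readline()  # continues where reader left off

reader.handle.close()
clone.handle.close()
os.remove(tmp.name)
```

Unlike the `__reduce__` example in the answer, this route does not call `__init__` on load; `__setstate__` is responsible for rebuilding everything the instance needs.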