
I'm receiving an object, t, of type Object from an API. I am unable to pickle it and get the following error:

  File "p.py", line 55, in <module>
    pickle.dump(t, open('data.pkl', 'wb'))
  File "/usr/lib/python2.6/pickle.py", line 1362, in dump
    Pickler(file, protocol).dump(obj)
  File "/usr/lib/python2.6/pickle.py", line 224, in dump
    self.save(obj)
  File "/usr/lib/python2.6/pickle.py", line 313, in save
    (t.__name__, obj))
pickle.PicklingError: Can't pickle 'Object' object: <Object object at 0xb77b11a0>

When I do the following:

for i in dir(t): print(type(i))

I get only string objects:

<type 'str'>
<type 'str'>
<type 'str'>
...
<type 'str'>
<type 'str'>
<type 'str'>

How can I print the contents of my Object object in order to understand why it can't be pickled?

It's also possible that the object contains C pointers to Qt objects, in which case it wouldn't make sense for me to pickle the object. But again, I would like to see the internal structure of the object in order to establish this.

Baz

4 Answers


I would use dill, which has tools to investigate which item inside your target object prevents it from being pickled. For an example, see the answer to "Good example of BadItem in Dill Module"; for the detection tools in real use, see the Q&A "pandas.algos._return_false causes PicklingError with dill.dump_session on CentOS".

>>> import dill
>>> x = iter([1,2,3,4])
>>> d = {'x':x}
>>> # we check for unpicklable items in d (i.e. the iterator x)
>>> dill.detect.baditems(d)
[<listiterator object at 0x10b0e48d0>]
>>> # note that nothing inside of the iterator is unpicklable!
>>> dill.detect.baditems(x)
[]

However, the most common starting point is to use trace:

>>> dill.detect.trace(True)
>>> dill.detect.errors(d)
D2: <dict object at 0x10b8394b0>
T4: <type 'listiterator'>
PicklingError("Can't pickle <type 'listiterator'>: it's not found as __builtin__.listiterator",)
>>> 

dill also has functionality to trace the referrers and referents of objects, so you can build a hierarchy of how objects refer to each other. See: https://github.com/uqfoundation/dill/issues/58
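To illustrate what "referrers" and "referents" mean, here is a minimal sketch that uses the standard library's gc module rather than dill's own tracing helpers (the dill calls are not shown here, so this is a swapped-in technique, not dill's API). It reuses the x and d names from the example above and assumes Python 3.

```python
import gc

x = iter([1, 2, 3, 4])
d = {'x': x}

# Referents: objects that d refers to directly (for a dict, its keys and values).
referents = gc.get_referents(d)

# Referrers: objects that refer to x directly; d is expected to be among them.
referrers = gc.get_referrers(x)

# Compare by identity, since we care about the exact objects involved.
x_is_referent_of_d = any(obj is x for obj in referents)
d_is_referrer_of_x = any(obj is d for obj in referrers)

print(x_is_referent_of_d, d_is_referrer_of_x)
```

Walking these links repeatedly is one way to build the kind of reference hierarchy described above.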

Alternately, there's also: cloudpickle.py and debugpickle.py, which are for the most part no longer developed. I'm the dill author, and hope to soon merge any functionality in these codes that is missing in dill.

Mike McKerns

You may want to read the Python docs and check your API's Object class afterwards.

With respect to the "internal structure of the object": instance attributes are usually stored in the __dict__ attribute, and since class attributes are not pickled, you only care about the instance attributes. Note that you'll also have to recursively inspect the __dict__ of each attribute.
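As a rough sketch of that idea (one level deep only, Python 3), the helper below lists each instance attribute with its type and whether it pickles on its own. The names describe_attributes and Example are hypothetical and only stand in for the API's Object class. Note that vars() raises TypeError for objects without a __dict__, which can happen with C-extension types.

```python
import pickle
import threading


def describe_attributes(obj):
    """Return {attribute name: (type name, picklable?)} for obj's instance attributes."""
    report = {}
    for name, value in vars(obj).items():
        try:
            pickle.dumps(value)
            picklable = True
        except Exception:
            picklable = False
        report[name] = (type(value).__name__, picklable)
    return report


class Example:
    """Hypothetical stand-in for the API's Object class."""

    def __init__(self):
        self.name = "demo"
        self.lock = threading.Lock()  # lock objects cannot be pickled


report = describe_attributes(Example())
for name, (type_name, picklable) in report.items():
    print(name, type_name, "picklable" if picklable else "NOT picklable")
```

To go deeper, the same function could be called again on each attribute that fails to pickle.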

iron9
bruno desthuilliers

I tried dill, but it didn't explain my issue. Instead, I used the following code from https://gist.github.com/andresriancho/15b5e226de68a0c2efd0, which happened to expose a bug in my __getattribute__ override:

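import cPickle


def debug_pickle(instance):
    """
    :return: The name of the first attribute of this object that can't be
             pickled, or None if every attribute pickles.
    """
    attribute = None

    for k, v in instance.__dict__.iteritems():
        try:
            cPickle.dumps(v)
        except Exception:
            # Pickling this attribute's value failed; report its name.
            attribute = k
            break

    return attribute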

Edit: Here's a reproduction of my code, using pickle and cPickle:

import cPickle
import pickle
import traceback


class myDict(dict):

    def __getattribute__(self, item):
        # Try to get attribute from internal dict
        item = item.replace("_", "$")

        if item in self:
            return self[item]

        # Try super, which may lead to an AttributeError
        return super(myDict, self).__getattribute__(item)

myd = myDict()

try: 
    with open('test.pickle', 'wb') as myf:
        cPickle.dump(myd, myf, protocol=-1)
except:
    print traceback.format_exc()


try:
    with open('test.pickle', 'wb') as myf:
        pickle.dump(myd, myf, protocol=-1)
except:
    print traceback.format_exc()

Output:

Traceback (most recent call last):
  File "/Users/myuser/Documents/workspace/AcceptanceTesting/ingest.py", line 35, in <module>
    cPickle.dump(myd, myf, protocol=-1)
UnpickleableError: Cannot pickle <class '__main__.myDict'> objects

Traceback (most recent call last):
  File "/Users/myuser/Documents/workspace/AcceptanceTesting/ingest.py", line 42, in <module>
    pickle.dump(myd, myf, protocol=-1)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 1370, in dump
    Pickler(file, protocol).dump(obj)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 224, in dump
    self.save(obj)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 313, in save
    (t.__name__, obj))
PicklingError: Can't pickle 'myDict' object: {}

You'll see that pickling fails because attribute names are being mangled by __getattribute__.
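The mangling can be observed directly. The sketch below is a Python 3 port of the class above, renamed MyDict for illustration (it is not the original code). It shows that every underscore in a looked-up name is replaced before the lookup, so special attributes such as __dict__ and __reduce_ex__ can no longer be found, which is presumably why pickle gives up on the object.

```python
class MyDict(dict):
    """Python 3 port of the myDict class above (renamed for illustration)."""

    def __getattribute__(self, item):
        # "__dict__" becomes "$$dict$$" before any lookup happens.
        item = item.replace("_", "$")
        if item in self:
            return self[item]
        return super().__getattribute__(item)


myd = MyDict()

try:
    myd.__dict__
except AttributeError as e:
    # The error names the mangled attribute, not the one that was requested.
    message = str(e)

has_reduce_ex = hasattr(myd, "__reduce_ex__")

print(message)
print(has_reduce_ex)
```

This also suggests why debug_pickle raised an error instead of returning an attribute name: its very first step, instance.__dict__, already goes through the broken lookup.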

Alastair McCormack
  • Can you clarify what you mean `dill` didn't help, but the above code did? The above code is a weaker version of code that exists in `dill.detect`. Specifically, `dill.detect.badobjects` and `dill.detect.badtypes`. `dill` also has the other tools that are shown in my answer. If there's something that `dill` is missing here, I'd like to know. – Mike McKerns Feb 12 '16 at 14:19
  • Hi @MikeMcKerns. Dill looked great, I just don't think you'll be able to code for my stupidity :) I've updated my question with a reproduction of my buggy code. – Alastair McCormack Feb 12 '16 at 15:05
  • Ah… ok, you have a real corner case there. `dill` doesn't find the problem in the case, I agree. However, you will also note that `debug_pickle` doesn't actually return the bad attribute… it just happens to not catch the error that causes the pickling to fail. Useful, but seemingly unintentional. I'll make a note of this case as a ticket for `dill`. Thanks. – Mike McKerns Feb 12 '16 at 15:51

Here's an extension of Alastair's solution, in Python 3.

It:

  • is recursive, to deal with complex objects where the problem might be many layers deep.

    The output is in the form .x[i].y.z.... to allow you to see which members were called to get to the problem. With dict it just prints [key/val type=...] instead, since either keys or values can be the problem, making it harder (but not impossible) to reference a specific key or value in the dict.

  • accounts for more types, specifically list, tuple and dict, which need to be handled separately, since they don't have __dict__ attributes.

  • returns all problems, rather than just the first one.

import pickle


def get_unpicklable(instance, exception=None, string='', first_only=True):
    """
    Recursively go through all attributes of instance and return a list of whatever
    can't be pickled.

    Set first_only to only report the first problematic element in a list, tuple or
    dict (otherwise there could be lots of duplication).
    """
    problems = []
    if isinstance(instance, (tuple, list)):
        for k, v in enumerate(instance):
            try:
                pickle.dumps(v)
            except BaseException as e:
                problems.extend(
                    get_unpicklable(v, e, string + f'[{k}]', first_only)
                )
                if first_only:
                    break
    elif isinstance(instance, dict):
        for k in instance:
            try:
                pickle.dumps(k)
            except BaseException as e:
                problems.extend(get_unpicklable(
                    k, e, string + f'[key type={type(k).__name__}]', first_only
                ))
                if first_only:
                    break
        for v in instance.values():
            try:
                pickle.dumps(v)
            except BaseException as e:
                problems.extend(get_unpicklable(
                    v, e, string + f'[val type={type(v).__name__}]', first_only
                ))
                if first_only:
                    break
    else:
        # Some objects have no __dict__ (for example, instances of classes that
        # define __slots__); treat them as having no attributes to inspect.
        for k, v in getattr(instance, '__dict__', {}).items():
            try:
                pickle.dumps(v)
            except BaseException as e:
                problems.extend(
                    get_unpicklable(v, e, string + '.' + k, first_only)
                )

    # if we get here, it means pickling instance caused an exception (string is not
    # empty), yet no member was a problem (problems is empty), thus instance itself
    # is the problem.
    if string != '' and not problems:
        problems.append(
            string + f" (Type '{type(instance).__name__}' caused: {exception})"
        )

    return problems
Bernhard Barker
  • Some types end up in the `else` clause that don't have a `__dict__`, eg, methods. I just ignored `AttributeError` if raised by `__dict__`. – Chris May 17 '23 at 10:38