I have a list of objects of various types that I want to pickle. I would like to pickle only those which are pickleable. Is there a standard way to check if an object is of pickleable type, other than trying to pickle it?

The documentation says that if a pickling exception occurs it may be already after some of the bytes have been written to the file, so trying to pickle the objects as a test doesn't seem like a good solution.

I saw this post but it doesn't answer my question.

Bitwise
  • Trying to write it to a file can be a solution. Just don't write it to your real output file but somewhere else. To /dev/null or somewhere. – Hyperboreus Jul 26 '13 at 03:00
  • Here are the rules for what can be pickled: https://docs.python.org/3/library/pickle.html#what-can-be-pickled-and-unpickled – slushy Dec 09 '16 at 04:11
  • why did you accept the duck typing answer when dill has the functionality you want with `dill.pickles(f)`? – Charlie Parker Mar 15 '21 at 19:13
  • I thumbed up the `duck` typing as well as the `dill.pickles` answer because it provides more details and further reading. But the OP probably chose it because the pickles answer came in 2 years after the duck typing answer. – Chris Rudd Aug 20 '21 at 20:17
  • @charlie-parker sometimes you don't wanna install a new package for every little feature. every package means more dependencies, more to maintain, more attack surface to be compromised (package components and repository). I prefer stdlib solutions over unknown packages of dubious origin. – Ed_ Aug 05 '23 at 14:25

3 Answers


The dill package has a dill.pickles function that does just that. (The session below was recorded under Python 2, hence xrange and the <type ...> reprs.)

>>> import dill
>>> class Foo(object):
...   x = iter([1,2,3])
... 
>>> f = Foo()     
>>> 
>>> dill.pickles(f)
False

We can use methods in dill to look for what causes the failure.

>>> dill.detect.badtypes(f)
<class '__main__.Foo'>
>>> dill.detect.badtypes(f, depth=1)
{'__setattr__': <type 'method-wrapper'>, '__reduce_ex__': <type 'builtin_function_or_method'>, '__reduce__': <type 'builtin_function_or_method'>, '__str__': <type 'method-wrapper'>, '__format__': <type 'builtin_function_or_method'>, '__getattribute__': <type 'method-wrapper'>, '__class__': <type 'type'>, '__delattr__': <type 'method-wrapper'>, '__subclasshook__': <type 'builtin_function_or_method'>, '__repr__': <type 'method-wrapper'>, '__hash__': <type 'method-wrapper'>, 'x': <type 'listiterator'>, '__sizeof__': <type 'builtin_function_or_method'>, '__init__': <type 'method-wrapper'>}
>>> dill.detect.badtypes(f, depth=1).keys()
['__setattr__', '__reduce_ex__', '__reduce__', '__str__', '__format__', '__getattribute__', '__class__', '__delattr__', '__subclasshook__', '__repr__', '__hash__', 'x', '__sizeof__', '__init__']

So, the only thing that's failing that's not a "built-in" method of the class is x… so that's a good place to start. Let's check 'x', then replace it with something else if it's the problem.

>>> dill.pickles(Foo.x)
False
>>> Foo.x = xrange(1,4)
>>> dill.pickles(Foo.x)
True

Yep, x was causing a failure, and replacing it with an xrange works because dill can pickle an xrange. What's left to do?

>>> dill.detect.badtypes(f, depth=1).keys()
[]
>>> dill.detect.badtypes(f, depth=1)       
{}
>>> dill.pickles(f)                 
True
>>> 

Apparently (likely because references to x in the class __dict__ now pickle), f now pickles… so we are done.
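For readers who would rather not add a dependency, the same narrowing-down idea can be roughly approximated with the standard library. This is only a sketch, not what dill does internally: `unpicklable_attrs` and the `Job` class are hypothetical names made up for illustration. Note also that the standard pickle module stores an instance's class by reference, so it only cares about the instance's own attributes, not class attributes such as Foo.x above.

```python
import pickle
import threading

def unpicklable_attrs(obj):
    """Return the names of instance attributes the stdlib pickler rejects."""
    bad = []
    for name, value in vars(obj).items():
        try:
            pickle.dumps(value)  # trial pickle of one attribute at a time
        except (pickle.PicklingError, TypeError, AttributeError):
            bad.append(name)
    return bad

class Job:  # hypothetical example class
    def __init__(self):
        self.name = "job-1"
        self.lock = threading.Lock()  # thread locks cannot be pickled

print(unpicklable_attrs(Job()))  # → ['lock']
```

This checks only one level of attributes; nested containers are covered only to the extent that pickling the attribute value itself fails.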

dill also provides a trace that shows the exact path taken when pickling the object.

>>> dill.detect.trace(True)
>>> dill.pickles(f)
T2: <class '__main__.Foo'>
F2: <function _create_type at 0x10e79b668>
T1: <type 'type'>
F2: <function _load_type at 0x10e79b5f0>
T1: <type 'object'>
D2: <dict object at 0x10e7c6168>
Si: xrange(1, 4)
F2: <function _eval_repr at 0x10e79bcf8>
D2: <dict object at 0x10e7c6280>
True
Azat Ibrakov
Mike McKerns
  • When I try this I get so many functions that I'm more unsure where to start (pickling a class, it seems to return all submethods) – Roelant Jan 07 '21 at 10:43
  • @Roelant: I assume when you say you "try this", you mean you look at the trace. Pickling is recursive, and thus you will see a lot of "sub-objects". Each time you see a marker like `F2` or `D1`, another "sub-object" is being opened for examination, and there's a similar closing marker for when the object is actually pickled. – Mike McKerns Jan 07 '21 at 14:21

I would propose duck testing in this case. Try to pickle into a temporary file or an in-memory file, whichever suits you; if it fails, discard the result, and if it succeeds, rename it to the real output file.
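A minimal sketch of that approach, assuming the goal is to write the picklable subset of a list to one file. The name `dump_picklable` and the set of caught exceptions are assumptions for illustration; the standard pickler can raise other errors too (for example RecursionError on deeply nested data).

```python
import os
import pickle
import tempfile
import threading

# Exceptions the standard pickler commonly raises for unpicklable objects.
PICKLE_ERRORS = (pickle.PicklingError, TypeError, AttributeError)

def dump_picklable(objects, path):
    """Pickle only the picklable items of `objects` to `path`; return them."""
    kept = []
    for obj in objects:
        try:
            pickle.dumps(obj)  # trial run in memory; nothing touches the disk
        except PICKLE_ERRORS:
            continue
        kept.append(obj)

    # Write to a temporary file in the same directory, then rename it,
    # so a failure never leaves a half-written file at `path`.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            pickle.dump(kept, tmp)
        os.replace(tmp_path, path)
    except BaseException:
        os.remove(tmp_path)
        raise
    return kept

# Usage: the thread lock is skipped, everything else is written.
with tempfile.TemporaryDirectory() as d:
    target = os.path.join(d, "objects.pkl")
    kept = dump_picklable([1, "two", threading.Lock(), [3.0]], target)
    with open(target, "rb") as f:
        restored = pickle.load(f)
```

Each object is pickled twice here (once as a test, once for real), which trades some speed for never touching the real output file until the data is known to be good.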

Why?

In Python you can check whether an object has some property in two ways.

Check if object is an instance of some Abstract Base Class. E.g. Number "The root of the numeric hierarchy. If you just want to check if an argument x is a number, without caring what kind, use isinstance(x, Number)."

Or try it and then handle the exceptions. This comes up on many occasions. The Pythonic philosophy is based around the duck: duck typing, duck test, and EAFP are the keywords.
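To make the two styles concrete, here is a small sketch. The helper names `is_number` and `is_iterable` are made up for illustration; the first uses an ABC check, the second the try-it-and-see (EAFP) style.

```python
from numbers import Number

def is_number(x):
    """ABC check: ask whether x is registered as a Number."""
    return isinstance(x, Number)

def is_iterable(x):
    """Duck test: try to iterate and treat TypeError as 'no'."""
    try:
        iter(x)
    except TypeError:
        return False
    return True

print(is_number(3.5), is_number("3"))      # → True False
print(is_iterable([1, 2]), is_iterable(5))  # → True False
```

Pickling has no ABC to check against, so only the second style is available for it.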

I even believe the first one was only properly introduced with Python 3, under pressure from part of the community, while many still strongly believe the duck is the way to go in Python.

AFAIK there are no special preconditions that can be checked, nor any ABC that an object can be checked against, in the case of pickling. So all that is left is the duck.

Maybe something else could be attempted, but it is probably not worth it. It would be very hard to introspect the object manually to find out in advance whether it is suitable for pickling.

luk32
  • Thanks, I am familiar with python duck testing. It just surprises me that there isn't a better way of checking pickleability. Does every pickleable object need to implement certain methods? Can't we just duck test for one of these methods? – Bitwise Jul 26 '13 at 12:43
  • Well when I first stumbled upon it I was baffled. I needed to check if object is iterable. The simplest method I found is `try: mock = iter(data[0]) except TypeError:`. And it is quite against some python-ways because ideally I should treat it as iterable and pass further. However, this had serious drawback of errors popping up too low to find them easily. From what I read in the docs, python uses its internal knowledge to pickle objects. Its not like `__str__`. You can provide some helpers in strange cases, but they are not necessary everywhere. I didn't find any other reliable method. – luk32 Jul 26 '13 at 14:19
  • That explains it, thanks. Still seems to me like a weird design choice - but I am probably not seeing all the angles on this. – Bitwise Jul 26 '13 at 14:31
  • @Bitwise Pickling is a complex and, perhaps more importantly, *recursive* process that depends heavily on what's in scope where. Remember that Python is a dynamic language, so "what is where" is not an easy question to ask at all, and indeed asking can change the answer. So in fact the only possible way to determine if an object is pickleable... is to try pickling it. Actually, a perfect `isPickleable()` would probably need to solve the halting problem. – Schilcote Jul 01 '15 at 04:41

dill allows for pickling more things than the builtin pickle.

This should do what you want, I think:

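import pickle

def is_picklable(obj):
  """Return True if obj can be serialized with the standard pickle module."""
  try:
    pickle.dumps(obj)
  # Besides PicklingError, pickle can raise TypeError or AttributeError
  # for some unpicklable objects (e.g. a thread lock raises TypeError).
  except (pickle.PicklingError, TypeError, AttributeError):
    return False
  return True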
Max Bileschi