
This is the code I am trying to run:

p = ProgressBar(maxval=len(img_paths))

sm = SaliencyMaskSlic()
operations = [('img_resize', img_resize), ('sal_mask', sm.transform)]
args_list = [{'h_size':258}, {'cropped':True}]

pre_pipeline = Pipeline(ops=operations, arg_list=args_list)
ch = ColorHist('RGB', [6,6,6], [2,2], center=True, pre_pipeline = pre_pipeline)

for count,img_path in enumerate(img_paths):
    s.submit(ch.transform, (img_path,))
    p.update(count)
p.finish()

and it raises:

---------------------------------------------------------------------------
PicklingError                             Traceback (most recent call last)
<ipython-input-44-b62cf2241437> in <module>()
      9 
     10 for count,img_path in enumerate(img_paths):
---> 11     s.submit(ch.transform, (img_path,))
     12     p.update(count)
     13 p.finish()

/usr/local/lib/python2.7/dist-packages/pp-1.6.4-py2.7.egg/pp.pyc in submit(self, func, args, depfuncs, modules, callback, callbackargs, group, globals)
    458 
    459         sfunc = self.__dumpsfunc((func, ) + depfuncs, modules)
--> 460         sargs = pickle.dumps(args, self.__pickle_proto)
    461 
    462         self.__queue_lock.acquire()

PicklingError: Can't pickle <type 'instancemethod'>: attribute lookup __builtin__.instancemethod failed

How should I handle this case in Python in order to use the pp library? Or what other solutions are available?

Slater Victoroff
erogol

2 Answers


If you want to use pp but need stronger pickling, use dill. The dill serializer can pickle most of Python. I'm the dill author, and I have also extended pp to use dill.
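To make the difference concrete, here is a minimal sketch, assuming Python 3 (the sessions in this answer are Python 2.7). The helper `try_pickle` and the lambda `square` are hypothetical names used only for illustration. It checks whether an object survives a dumps/loads round trip with the standard pickle module and, only if dill happens to be installed, with dill as well.

```python
import pickle


def try_pickle(obj, serializer=pickle):
    """Return True if obj survives a dumps/loads round trip, else False."""
    try:
        serializer.loads(serializer.dumps(obj))
        return True
    except Exception:
        # Any failure here is treated as "not picklable" for this sketch.
        return False


# A lambda has no importable name, so pickle cannot store it by reference.
square = lambda x: x * x

stdlib_ok = try_pickle(square)

try:
    import dill  # optional third-party dependency (assumption: may be absent)
    dill_ok = try_pickle(square, dill)
except ImportError:
    dill_ok = None  # dill not installed; nothing to compare against
```

With the standard library alone, `stdlib_ok` ends up False because pickle stores functions by qualified name and a lambda has none that can be looked up again.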

pp serializes a function by extracting its source code, much like the standard inspect module does with inspect.getsource. dill.source.getsource gives you much more powerful code inspection and source retrieval.

Python 2.7.8 (default, Jul 13 2014, 02:29:54) 
[GCC 4.2.1 Compatible Apple Clang 4.1 ((tags/Apple/clang-421.11.66))] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from dill import source
>>> def test_source(obj):
...     _obj = source._wrap(obj)
...     assert _obj(1.57) == obj(1.57)
...     src = source.getimportable(obj, alias='_f')
...     exec src in globals(), locals()
...     assert _f(1.57) == obj(1.57)
...     name = source.getname(obj)
...     assert name == obj.__name__ or src.split("=",1)[0].strip()
... 
>>> def test_ppmap(obj):
...     from pathos.pp import ParallelPythonPool
...     p = ParallelPythonPool(2)
...     x = [1,2,3]
...     assert map(obj, x) == p.map(obj, x)
... 
>>> from math import sin
>>> f = lambda x: x+1
>>> def g(x):
...   return x+2
... 
>>> for func in [g, f, abs, sin]:
...   test_source(func)
...   test_ppmap(func)
... 
>>> 
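The source-extraction idea mentioned above can be sketched with the standard library alone. This assumes Python 3 and is only an illustration of the principle, not a description of how pp or dill work internally; `add_two`, `worker_namespace` and `rebuilt` are hypothetical names.

```python
import inspect
import textwrap


def add_two(x):
    return x + 2


try:
    # Works when the function was defined in a file on disk.
    source_text = textwrap.dedent(inspect.getsource(add_two))
except (OSError, TypeError):
    # Some interactive shells keep no source; fall back to a literal copy.
    source_text = "def add_two(x):\n    return x + 2\n"

# "Worker side": rebuild the function from its text in a fresh namespace.
worker_namespace = {}
exec(source_text, worker_namespace)
rebuilt = worker_namespace["add_two"]

result = rebuilt(40)
```

The rebuilt function is a separate object from the original, which is the point: only text crossed the boundary, so nothing had to be pickled by reference.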

You'll need to get the version of pp that comes with pathos, and it's probably better to use the pathos.pp layer on top of my extension of pp.

This also works with class instances and the like. pathos.pp also provides an asynchronous map (see below) as well as an iterator map (not shown).

>>> class B:
...   def zap(self, x): 
...     return x**2 + self.y
...   y = 1
... 
>>> b = B()
>>> 
>>> from pathos.pp import ParallelPythonPool
>>> p = ParallelPythonPool(2)
>>> res = p.amap(b.zap, range(5))
>>> res.get()
[1, 2, 5, 10, 17]

Update: I've now built a standalone fork of pp, called ppft, that uses dill for better serialization. There's no need to install pathos to get it. If you don't care about having a Pool interface, use ppft; if you want a Pool interface, use pathos.pp.

Get the code here: https://github.com/uqfoundation

Mike McKerns

The error message tells you that the object you are trying to send is unpicklable. The reason it is unpicklable is that it is a classic class. In Python 3 new-style classes are the default, but in Python 2 you need to inherit from object:

Try to replace class ColorHist(): with class ColorHist(object): and see if it works for you.
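As a small sketch of the distinction, assuming Python 3, both spellings below produce the same kind of class. Under Python 2 only the second would be new-style, which is why the change above is suggested. The class names are hypothetical, and this check says nothing by itself about whether a given method can be pickled.

```python
class ImplicitBase:  # no explicit base class
    pass


class ExplicitBase(object):  # explicit object base, needed in Python 2
    pass


# In Python 3 both classes inherit directly from object ...
same_bases = ImplicitBase.__bases__ == ExplicitBase.__bases__ == (object,)

# ... and both are instances of type, i.e. new-style classes.
both_new_style = isinstance(ImplicitBase, type) and isinstance(ExplicitBase, type)
```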

jarondl