
I am trying to use Python's multiprocessing module to process a large set of URLs. I create the worker processes with a multiprocessing.Pool object, as shown below.

from multiprocessing import Pool, TimeoutError, cpu_count

class MyClass:
    def square(self, x):
        return x*x

    @staticmethod
    def getNumbers():
        return range(10)

    def calculate(self):
        pool = Pool(processes=min(cpu_count(),8))
        results = [pool.apply(self.square,(i,)) for i in self.getNumbers()]
        pool.close()
        pool.join()
        for result in results:
            print result


if __name__ == '__main__':
    instance = MyClass()
    instance.calculate()

However, the code above raises a pickling error:

Traceback (most recent call last):
  File "multi.py", line 24, in <module>
    instance.calculate()
  File "multi.py", line 15, in calculate
    results = [pool.apply(self.square,(i,)) for i in self.getNumbers()]
  File "/usr/lib/python2.7/multiprocessing/pool.py", line 244, in apply
    return self.apply_async(func, args, kwds).get()
  File "/usr/lib/python2.7/multiprocessing/pool.py", line 558, in get
    raise self._value
cPickle.PicklingError: Can't pickle <type 'instancemethod'>: attribute lookup __builtin__.instancemethod failed

NOTE: A similar question was asked previously on SO but remains unanswered: cPickle.PicklingError: Can't pickle <type 'instancemethod'>: attribute lookup __builtin__.instancemethod failed

EDIT: Replaced the code with a better sample.

bawejakunal
  • Shouldn't `pool.apply(self.processURL, injectionPoint, "GET")` be `pool.apply(self.processURL, (injectionPoint, "GET"))` ? – Shaung Feb 12 '16 at 06:06
  • @Shaung yeah, sorry made a mistake while copying and editing code here. Corrected now. Thanks for noticing :) – bawejakunal Feb 12 '16 at 06:12
  • I found a "trick" at the following link, but I suspect it is a poor hack. Please help me understand what the problem is with pickling in `multiprocessing` and how to handle it correctly for class instance methods. The trick: http://www.rueckstiess.net/research/snippets/show/ca1d7d90 – bawejakunal Feb 12 '16 at 06:24

1 Answer


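In Python 2, the standard pickle module cannot serialize instance methods, and multiprocessing relies on pickle to send the callable to the worker processes. That is why passing self.square to pool.apply fails. Instance methods aren't among the picklable types listed in the pickle documentation.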

If you don't mind using an external library, you can look at multiprocess, a drop-in replacement for Python's multiprocessing that serializes with dill instead of pickle. To use it, do the following:

  • pip install multiprocess
  • replace
    from multiprocessing import Pool, TimeoutError, cpu_count
    with
    from multiprocess import Pool, TimeoutError, cpu_count

I have tested your example on my machine, and it runs without the pickling error when using multiprocess.

chriscz