4

Short version:

I am having trouble parallelizing code which uses instance methods.

Longer version:

This Python code produces the following error:

Error
Traceback (most recent call last):
  File "/Users/gilzellner/dev/git/3.2.1-build/cloudify-system-tests/cosmo_tester/test_suites/stress_test_openstack/test_file.py", line 24, in test
    self.pool.map(self.f, [self, url])
  File "/Users/gilzellner/.virtualenvs/3.2.1-build/lib/python2.7/site-packages/pathos/multiprocessing.py", line 131, in map
    return _pool.map(star(f), zip(*args)) # chunksize
  File "/Users/gilzellner/.virtualenvs/3.2.1-build/lib/python2.7/site-packages/multiprocess/pool.py", line 251, in map
    return self.map_async(func, iterable, chunksize).get()
  File "/Users/gilzellner/.virtualenvs/3.2.1-build/lib/python2.7/site-packages/multiprocess/pool.py", line 567, in get
    raise self._value
AttributeError: 'Test' object has no attribute 'get_type'

This is a simplified version of a real problem I have.

import urllib2
from time import sleep
from os import getpid
import unittest
from pathos.multiprocessing import ProcessingPool as Pool

class Test(unittest.TestCase):

    def f(self, x):
        print urllib2.urlopen(x).read()
        print getpid()
        return

    def g(self, y, z):
        print y
        print z
        return

    def test(self):
        url = "http://nba.com"
        self.pool = Pool(processes=1)
        for x in range(0, 3):
            self.pool.map(self.f, [self, url])
            self.pool.map(self.g, [self, url, 1])
        sleep(10)

I am using pathos.multiprocessing due to the recommendation here: Multiprocessing: Pool and pickle Error -- Pickling Error: Can't pickle <type 'instancemethod'>: attribute lookup __builtin__.instancemethod failed

Before using pathos.multiprocessing, the error was:

"PicklingError: Can't pickle <type 'instancemethod'>: attribute lookup __builtin__.instancemethod failed"
Gil Zellner
  • Please paste the whole traceback - somehow the instance of `Test` is being passed instead of `url` – matino Aug 03 '15 at 14:23
  • done, thanks for your help! – Gil Zellner Aug 03 '15 at 14:41
  • Do you need to use instance here? Can't you use functions? – matino Aug 03 '15 at 14:50
  • this is a smaller version of something much bigger that I am working on. (https://github.com/cloudify-cosmo/cloudify-system-tests). I purposely made this as a scaled down version so I can put it here. – Gil Zellner Aug 03 '15 at 14:53
  • Are you on Windows? This `self._value` error often happens on Windows when you don't use `pathos.helpers.freeze_support` and run from `__main__`. On non-Windows systems, this error is much less common. – Mike McKerns Aug 03 '15 at 19:14
  • OSX, using virtualenv as to not interfere with system python – Gil Zellner Aug 04 '15 at 08:09
  • Ok, then you don't need `freeze_support`. This obtuse `self._value` error often comes from a pickling, coding, or other error that is thrown inside a multiprocessing call. – Mike McKerns Aug 04 '15 at 11:12
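
To illustrate the last comment: a pool stores whatever exception a worker raised and re-raises it in the caller, which is why the traceback ends at `raise self._value` inside `pool.py` instead of at the line that actually failed. Below is a minimal sketch of that behaviour. It assumes the standard-library `multiprocessing.pool.ThreadPool` as a stand-in for the pathos pool (same `map` call path, but no pickling involved); `Dummy` and `worker` are hypothetical names, not part of the question's code.

```python
from multiprocessing.pool import ThreadPool


class Dummy(object):
    """Hypothetical object that lacks the method the worker expects."""


def worker(item):
    # This raises AttributeError inside the pool worker, not in the caller.
    return item.get_type()


pool = ThreadPool(1)
caught = None
try:
    pool.map(worker, [Dummy()])
except AttributeError as exc:
    # The pool kept the worker's exception and re-raised it here.
    caught = exc
finally:
    pool.close()
    pool.join()

print(type(caught).__name__)
print(caught)
```

The exception type and message survive the trip, but the frame that raised it is not the last one shown, so the message is usually the best clue.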

2 Answers

1

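You're using the multiprocessing `map` method incorrectly: it calls the function once for each item of the iterable, so `[self, url]` is treated as two separate inputs rather than as an argument list.
According to the Python docs, `Pool.map` is: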

A parallel equivalent of the map() built-in function (it supports only one iterable argument though).

And the standard `map` is described as:

Apply function to every item of iterable and return a list of the results.

Example usage:

from multiprocessing import Pool

def f(x):
    return x*x

if __name__ == '__main__':
    p = Pool(5)
    print(p.map(f, [1, 2, 3]))

What you're looking for is the `apply_async` method:

def test(self):
    url = "http://nba.com"
    self.pool = Pool(processes=1)
    for x in range(0, 3):
        # self.f and self.g are bound methods, so self is not passed again
        self.pool.apply_async(self.f, args=(url,))
        self.pool.apply_async(self.g, args=(url, 1))
    sleep(10)
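
As a rough, self-contained sketch of the argument passing: a bound method already carries its instance, so only the remaining arguments are supplied. The example assumes the standard-library `ThreadPool` so that it runs without pickling; a process pool would still need the bound method to be picklable, which is the reason the question uses pathos. `Worker` and `g_packed` are hypothetical names introduced for illustration.

```python
from multiprocessing.pool import ThreadPool


class Worker(object):
    """Hypothetical class standing in for the Test case in the question."""

    def f(self, x):
        return "f got %s" % x

    def g(self, y, z):
        return "g got %s and %s" % (y, z)

    def g_packed(self, pair):
        # map passes one item per call, so unpack a (y, z) tuple here.
        return self.g(*pair)


worker = Worker()
url = "http://nba.com"
pool = ThreadPool(2)
try:
    # worker.f is already bound to worker, so only x goes into args.
    single = pool.apply_async(worker.f, args=(url,)).get()
    double = pool.apply_async(worker.g, args=(url, 1)).get()
    # With map, each list element becomes one separate call.
    mapped = pool.map(worker.f, [url, url])
    packed = pool.map(worker.g_packed, [(url, 1), (url, 2)])
finally:
    pool.close()
    pool.join()

print(single)
print(double)
print(mapped)
print(packed)
```

Calling `.get()` on the result of `apply_async` is what surfaces an exception from the worker; if the result is never fetched, a failing call goes unnoticed.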
matino
  • pathos.multiprocessing doesn't have an "apply_async" method. but nevermind, I worked around that – Gil Zellner Aug 03 '15 at 15:01
  • You can also use `map`, but you need to pass the arguments in another way ;) – matino Aug 03 '15 at 15:02
  • so now this example works, but I get a different problem in my real code, will open a new question. – Gil Zellner Aug 03 '15 at 15:03
  • I'm the author of `pathos`. A `pool` in `pathos` has `apipe` instead of `apply_async`. Or, if you want to use `apply_async`, then you can directly use the forked `multiprocessing` code in `pathos.helpers.mp.Pool`… which does have `apply_async`. Similarly, `pathos` uses `amap` instead of `map_async`… however `pathos.helpers.mp.Pool` does have `map_async`. – Mike McKerns Aug 03 '15 at 19:07
  • I am now getting another error and I am not sure how to ask about this, since I cannot include the entire framework in one post. but essentially: TypeError: super(type, obj): obj must be an instance or subtype of type – Gil Zellner Aug 04 '15 at 08:12
  • Try to narrow down your code as much as you can and post another question. AFAIK your `super` call is incorrect or insufficient. – matino Aug 04 '15 at 08:46
  • I agree, that looks like a coding error with `super`, as opposed to a pickling issue, as @matino mentioned above. – Mike McKerns Aug 04 '15 at 11:08
-2

The error indicates that you are trying to read an attribute which is not defined on the `Test` object:

AttributeError: 'Test' object has no attribute 'get_type'

Your `Test` class does not define a `get_type` method or attribute, hence the error.
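
As for where the lookup comes from: as far as I can tell from the Python 2.7 `urllib2` source, `urlopen` accepts either a URL string or a `Request` object, and calls `get_type()` on anything that is not a string. Since `map(self.f, [self, url])` calls `f` once with the `Test` instance as `x`, that instance ends up being treated as a request. Here is a rough sketch of that chain, where `fake_urlopen` and `FakeRequest` are hypothetical simplifications and not urllib2's real code:

```python
class FakeRequest(object):
    """Hypothetical stand-in for urllib2.Request."""

    def __init__(self, url):
        self.url = url

    def get_type(self):
        # Scheme part of the URL, e.g. "http".
        return self.url.split(":", 1)[0]


def fake_urlopen(target):
    # Simplified assumption: strings are wrapped in a request, anything
    # else is treated as a request object and asked for its type.
    req = FakeRequest(target) if isinstance(target, str) else target
    return req.get_type()


class Test(object):
    def f(self, x):
        return fake_urlopen(x)


t = Test()
results = []
for item in [t, "http://nba.com"]:  # the items map would iterate over
    try:
        results.append(t.f(item))
    except AttributeError as exc:
        results.append(str(exc))

print(results)
```

The first item (the instance) fails with the same kind of message as in the question, while the second item (the URL string) goes through.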

Usman Azhar