
To start: I have never done any parallel processing, so I don't really know what I am doing. I have read about it a bit, but I still have a question. My problem seems closest to the one in this article: How to do parallel programming in Python. I have two functions that each take a while to run and operate independently:

# set dates to get data
d1 = DT.datetime(2015, 10, 1)
d2 = DT.datetime(2015, 10, 2)
# sets up a class to get various types of data from various places
gd = getdata(d1, d2)  
# both below return dictionary with unprocessed data
rawspec = gd.getwavespec(gaugenumber=0)  
rawwind = gd.getwind(gaugenumber=0)

Currently each function operates independently, takes approximately 1-5 minutes, and returns a dictionary of data (e.g. rawwind = {wind speed, direction, time}, rawspec = {time, Hs, Tp, Tm, 1D spectrum, 2D spectrum, etc.}). I would like to run the two in parallel to speed up the data preparation in my workflow. When I use the above link as a framework and try the following, I get TypeError: 'dict' object is not callable:

from multiprocessing import Pool
pool = Pool()
result = pool.apply_async(gd.getwavespec(), ['gaugenumber=0'])
# here I get print statements that suggest the data are retrieved

Data Gathered From Local Thredds Server

result.get(timeout=1000)  
Traceback (most recent call last):
    File "/home/spike/anaconda2/lib/python2.7/site-packages/IPython/core/interactiveshell.py", line 2885, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-115-19dc220c614d>", line 1, in <module>
    result.get(timeout=100)
  File "/home/spike/anaconda2/lib/python2.7/multiprocessing/pool.py", line 567, in get
    raise self._value
TypeError: 'dict' object is not callable

When I check whether the call was successful with result.successful(), I get False back. I'm not really sure how to troubleshoot this. When I run rawspec = gd.getwavespec(gaugenumber=0) from the IPython console, it returns successfully. Any help is much appreciated.

SBFRF

1 Answer


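Not sure if this helps, but I think you are calling apply_async incorrectly. Writing gd.getwavespec() calls the function immediately and hands its return value (a dict) to apply_async, which then tries to call that dict in the worker, hence the TypeError. Try removing the parentheses from the function name (use gd.getwavespec instead of gd.getwavespec()) and sending the arguments as a tuple. This is just a silly but working example: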

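from multiprocessing import Pool
from time import sleep

def foo(a):
    print(a)  # parenthesized so it runs on Python 2 and 3
    sleep(2)

q = Pool(5)
# pass the function itself (no parentheses) and its arguments as a tuple
q.apply_async(foo, args=(42,))
q.apply_async(foo, args=(43,))

sleep(10)  # crude wait so both tasks can finish before the script exits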
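
To make the difference concrete for the question's case, here is a minimal sketch, not the original poster's code. FakeGetData is a hypothetical stand-in for the getdata class, with made-up return values. It uses multiprocessing.pool.ThreadPool (threads rather than processes, same apply_async interface) so the example runs as-is; whether threads are enough for the real downloads is an assumption. It first reproduces the error, then submits both calls the intended way, passing gaugenumber through kwds.

```python
import time
from multiprocessing.pool import ThreadPool  # thread-based, same API as Pool


class FakeGetData(object):
    """Hypothetical stand-in for the question's getdata class."""

    def getwavespec(self, gaugenumber=0):
        time.sleep(0.2)  # pretend to fetch data
        return {"gauge": gaugenumber, "Hs": [1.2, 1.4]}

    def getwind(self, gaugenumber=0):
        time.sleep(0.2)  # pretend to fetch data
        return {"gauge": gaugenumber, "windspeed": [5.0, 6.5]}


gd = FakeGetData()
pool = ThreadPool(2)

# Wrong: the parentheses run getwavespec right here, so apply_async
# receives the returned dict and the worker later tries to call it.
bad = pool.apply_async(gd.getwavespec(), ['gaugenumber=0'])
bad.wait(10)                   # block until the task has finished
failed = not bad.successful()  # the worker raised, so this is True
try:
    bad.get(timeout=10)        # get() re-raises the worker's exception
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Right: pass the callable itself; keyword arguments go in kwds.
spec_result = pool.apply_async(gd.getwavespec, kwds={"gaugenumber": 0})
wind_result = pool.apply_async(gd.getwind, kwds={"gaugenumber": 0})

# Both tasks are already running; get() blocks until each one is done.
rawspec = spec_result.get(timeout=10)
rawwind = wind_result.get(timeout=10)

pool.close()
pool.join()
```

With a process-based Pool the call pattern is the same, but the function and its arguments must be picklable. On Python 2.7 bound methods are not picklable out of the box, so a module-level wrapper function may be needed there.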
Hannu