So I wrote some code with a basic structure like this:

from numpy import *
from dataloader import loadfile
from IPython.parallel import Client
from clustering import * 

data = loadfile(0)

N_CLASSES = 10
rowmax = nanmax(data.values, 0)
rowmin = nanmin(data.values, 0)

# Defines the size of block processed at a time
BLOCK_SIZE = 50000
classmean, classcov, classcovinv, classlogdet, classlogprob = init_stats(data, N_CLASSES, rowmax, rowmin)

client = Client()
ids    = client.ids
nodes  = len(ids)
view   = client.load_balanced_view()
dview  = client[:]

def get_ml_class(data, args):
    pass  # body omitted in the question
dview.scatter('datablock', data)
dview.execute('res1, res2 = get_ml_class(datablock, args)', block=False)

The output of the dview.execute call is

<AsyncResult: execute>

which I take to mean that it was executed. However, when I tried to pull the results with

dview.pull(['res1','res2'], block=True)

I get:

NameError: name 'res1' is not defined

Can someone please tell me what is wrong with my code? Thank you so much!

Thomas K
user3720918

1 Answer

Let us make a simpler example:

from IPython.parallel import Client

rc = Client()
dview = rc[:]
dview.scatter('a', range(16))
dview.execute('res1,res2 = a[0], a[1]', block=False)
dview.pull(['res1'], block=True)

This works as expected and gives the result:

[[0], [4], [8], [12]]

So at least this version works. Now let me change the code slightly:

from IPython.parallel import Client

rc = Client()
dview = rc[:]
dview.scatter('a', range(16))
dview.execute('res1,res2 = a[0], b[1]', block=False)
dview.pull(['res1'], block=True)

Now we have the NameError. Why?

Because there is an error on the execute line: it refers to a variable b, which does not exist. The non-blocking execute does not complain about this. In the first (working) case the status is:

<AsyncResult: finished>

and in the second (non-working) case:

<AsyncResult: execute>

Other than that it is very quiet, and the second message does not necessarily mean that an error has occurred. To see the real error message, pass block=True to execute. Then you will see what is going wrong.
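To see why res1 ends up undefined, here is a minimal local sketch that needs no cluster. It is only an analogy: a plain dict stands in for an engine's namespace, and the helper run_on_fake_engine is hypothetical, not part of IPython.parallel. It shows that when the right-hand side of the assignment raises, neither name on the left is ever bound.

```python
def run_on_fake_engine(code, namespace):
    """Run code in namespace and return the exception instead of raising it.

    Hypothetical stand-in for a non-blocking execute: the error is
    stored quietly rather than shown to the caller.
    """
    try:
        exec(code, namespace)
    except Exception as exc:
        return exc
    return None


# 'a' exists on the fake engine, 'b' does not
engine_ns = {"a": [0, 1, 2, 3]}
error = run_on_fake_engine("res1, res2 = a[0], b[1]", engine_ns)

print(type(error).__name__)   # NameError
print("res1" in engine_ns)    # False: the assignment never happened
```

A later pull of res1 therefore fails with its own NameError, which hides the original one about b.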

If you want to know whether your non-blocking execute worked, you have to capture the AsyncResult object returned by execute. It has several interesting methods, but the most useful here are ready and successful:

ar = dview.execute(...)
ar.ready()                # True if the process has finished
ar.successful()           # True if there were no exceptions raised

Any exception raised during the execution can also be retrieved with the get method of the AsyncResult object, which re-raises it locally. For example, my failing example gives this in the interactive shell:

>>> ar.get()

[0:execute]: 
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)<ipython-input-1-608f57d70b2f> in <module>()
----> 1 res1,res2=a[0]**2,b[1]**2
NameError: name 'b' is not defined

[1:execute]: 
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)<ipython-input-1-608f57d70b2f> in <module>()
----> 1 res1,res2=a[0]**2,b[1]**2
NameError: name 'b' is not defined

...
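The checks above can be combined into one small helper. This is a sketch, not part of IPython.parallel: report is a hypothetical name, and it assumes only that the object passed in offers wait, ready, successful and get, as the AsyncResult returned by dview.execute does.

```python
def report(ar, timeout=None):
    """Wait for an AsyncResult-like object and return a short status string.

    Hypothetical helper; relies only on wait(), ready(), successful()
    and get() being available on `ar`.
    """
    ar.wait(timeout)          # block until done, or until the timeout expires
    if not ar.ready():
        return "still running"
    if ar.successful():
        return "finished without errors"
    try:
        ar.get()              # re-raises the remote exception locally
    except Exception as exc:
        return "failed: %r" % (exc,)
    return "failed (no exception was re-raised)"
```

With a running cluster, it would be used as ar = dview.execute('...', block=False) followed by print(report(ar)).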

So, in summary: try to find out what goes wrong with the function you run remotely. At the moment it seems to raise an exception. The error might have something to do with args, which does not seem to be available on the engines. Maybe a scatter is missing?
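If args is indeed the missing piece, one possible fix is sketched below. It rests on assumptions the question does not confirm: that args should be identical on every engine (so push is used rather than scatter), and that get_ml_class has already been defined on the engines. The name run_with_args is hypothetical.

```python
def run_with_args(dview, data, args):
    """Send args and the data to the engines, then run the call in blocking mode.

    Sketch only: assumes get_ml_class already exists on every engine.
    """
    # push copies the same object to every engine under the name 'args'
    dview.push({'args': args}, block=True)
    # scatter splits the sequence so each engine gets its own 'datablock'
    dview.scatter('datablock', data, block=True)
    # block=True lets a remote exception surface here instead of staying hidden
    dview.execute('res1, res2 = get_ml_class(datablock, args)', block=True)
    return dview.pull(['res1', 'res2'], block=True)
```

Running everything with block=True first is slower, but it shows the first remote error at the line that caused it. Once the call works, the execute step can be switched back to block=False.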

DrV