I need to pass a variable to the setup() function of a dispy node so I can tell the node which data set to load from a config file. Otherwise I have to write a separate script for each data set, which is going to be painful.

def setup(): # executed on each node before jobs are scheduled
    # read data in file to global variable
    global data
    data = open('file.dat').read()
    return 0
...
if __name__ == '__main__':
    import dispy
    cluster = dispy.JobCluster(compute, depends=['file.dat'], setup=setup, cleanup=cleanup)

So I want to pass the string "file.dat" to setup so that each node loads the data only once (as it's large).

CpILL
  • Did you write `setup`? If you didn't, is there any documentation for it? Do we get to know whether it takes an argument, and if so, what it is and what it means? – Paul Cornelius Jul 07 '15 at 01:16
  • Yes, I wrote setup. I'm going off the examples in the docs, which don't seem to take arguments, and there doesn't seem to be an obvious way to pass any across (I'm guessing the function gets pickled, passed to each node and then called?) – CpILL Jul 07 '15 at 01:47

1 Answer

Let me see if I understand the problem. You want to pass an argument to setup, but the actual call of setup occurs somewhere in the function JobCluster. That call does not know it should pass an argument. Is that correct?

One solution is to use functools.partial from the standard library. You can do something like this:

if __name__ == '__main__':
    import dispy
    import functools
    # bind the file name now; f is later called with no arguments
    f = functools.partial(setup, "file.dat")
    cluster = dispy.JobCluster(compute, depends=['file.dat'], setup=f, cleanup=cleanup)

The object returned by partial, when called with no arguments, calls setup with one positional argument ("file.dat"). You have to rewrite setup to handle this argument, like so:

def setup(s): # executed on each node before jobs are scheduled
    # read data in file s to global variable
    global data
    with open(s) as fh:  # close the file once it has been read
        data = fh.read()
    return 0
Paul Cornelius