4

I am trying to do something like

resource = MyResource()
def fn(x):
   something = dosemthing(x, resource)
   return something

client = Client()
results = client.map(fn, data)

The issue is that resource is not serializable and is expensive to construct. Therefore I would like to construct it once on each worker and be available to be used by fn.

How do I do this? Or is there some other way to make resource available on all workers?

Daniel Mahler
  • 7,653
  • 5
  • 51
  • 90

1 Answers1

2

You can always construct a lazy resource, something like

class GiveAResource():
    resource = [None]
    def get_resource(self):
        if self.resource[0] is None:
            self.resource[0] = MyResource()
        return self.resource[0]

An instance of this will serialise between processes fine, so you can include it as an input to any function to be executed on workers, and then calling .get_resource() on it will get your local expensive resource (which will get remade on any worker which appears later on).

This class would be best defined in a module rather than dynamic code.

There is no locking here, so if several threads ask for the resource at the same time when it has not been needed so far, you will get redundant work.

mdurant
  • 27,272
  • 5
  • 45
  • 74