The easiest thing is to do this manually.
Without an example of your code, I have to make a lot of assumptions and write something pretty vague, so let's take the simplest case.
Assume you're using `dill` manually, so your existing code looks like this:
```python
obj = function_that_creates_giant_object()
for i in range(zillions):
    results.append(pool.apply(func, (dill.dumps(obj),)))
```
All you have to do is move the `dumps` out of the loop:
```python
obj = function_that_creates_giant_object()
objpickle = dill.dumps(obj)
for i in range(zillions):
    results.append(pool.apply(func, (objpickle,)))
```
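For completeness, here's a sketch of what `func` might look like on the worker side; it just has to `loads` the bytes back into an object (`do_something_with` is a stand-in for whatever work you're actually doing):

```python
import dill

def func(objpickle):
    # Rebuild the object from the pickled bytes; the expensive
    # dumps now happens only once, in the parent process.
    obj = dill.loads(objpickle)
    return do_something_with(obj)  # stand-in for your real work
```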
But depending on your actual use, it may be better to just stick a cache in front of `dill` (note that `lru_cache` keys the cache on its arguments, so this only works if `obj` is hashable):
```python
import functools

cachedpickle = functools.lru_cache(maxsize=10)(dill.dumps)

obj = function_that_creates_giant_object()
for i in range(zillions):
    results.append(pool.apply(func, (cachedpickle(obj),)))
```
Of course, if you're monkeypatching `multiprocessing` to use `dill` in place of `pickle`, you can just as easily patch it to use this `cachedpickle` function.
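For example, on CPython you can usually do that by patching the `ForkingPickler` that `multiprocessing` uses to serialize task arguments; the exact attributes have moved around between Python versions, so treat this as a sketch:

```python
import multiprocessing.reduction as reduction

def _cached_dumps(obj, protocol=None):
    # Ignore the protocol argument and reuse the caching
    # wrapper defined above; dill picks its own protocol.
    return cachedpickle(obj)

reduction.ForkingPickler.dumps = staticmethod(_cached_dumps)
reduction.ForkingPickler.loads = staticmethod(dill.loads)
```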
If you're using `multiprocess`, which is a forked version of `multiprocessing` that pre-substitutes `dill` for `pickle`, it's less obvious how to patch it; you'll need to go through the source and see where it's using `dill` and get it to use your wrapper. But IIRC, it just does an `import dill as pickle` somewhere and then uses the same code as (a slightly out-of-date version of) `multiprocessing`, so it isn't all that different.
In fact, you can even write a module that exposes the same interface as `pickle` and `dill`:
```python
import functools
import dill

def loads(s):
    return dill.loads(s)

@functools.lru_cache(maxsize=10)
def dumps(o):
    return dill.dumps(o)
```
… and just replace the `import dill as pickle` with `import mycachingmodule as pickle`.
… or even monkeypatch it after loading with `multiprocess.helpers.pickle = mycachingmodule` (or whatever the appropriate name is; you're still going to have to find where the relevant `import` happens in the source of whatever you're using).
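As a quick sanity check that the caching layer works, assuming you've saved the module above as `mycachingmodule.py` (a made-up name):

```python
import mycachingmodule as pickle

obj = ('a', 'giant', 'object')   # anything hashable will do
a = pickle.dumps(obj)
b = pickle.dumps(obj)

assert a is b                    # second call came from the cache
assert pickle.loads(a) == obj    # round-trips correctly
```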
And that's about as complicated as it's likely to get.