
I EDITED MY ORIGINAL POST in order to give a simpler example. I use the differential evolution (DE) solver from SciPy to optimize certain parameters, and I would like to use all of the PC's processors for this task, so I am trying the option "workers=-1".

The documented condition is that the function called by DE must be pickleable.

If I run the example from https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.differential_evolution.html#scipy.optimize.differential_evolution, the optimisation works:

from scipy.optimize import rosen, differential_evolution
import pickle
import dill

bounds = [(0,2), (0, 2)]
result = differential_evolution(rosen, bounds, updating='deferred',workers=-1)
result.x, result.fun
(array([1., 1.]), 0.0)

But if I define a custom function 'Ros_custom', the optimisation crashes (it never gives a result):

def Ros_custom(X):
    x = X[0]
    y = X[1]
    a = 1. - x
    b = y - x*x
    return a*a + b*b*100

result = differential_evolution(Ros_custom, bounds, updating='deferred',workers=-1)

If I pickle.dumps and then pickle.loads 'Ros_custom', I get the same behaviour (the optimisation crashes with no answer).

If I use dill:

Ros_pick_1 = dill.dumps(Ros_custom)
Ros_pick_2 = dill.loads(Ros_pick_1)
result = differential_evolution(Ros_pick_2, bounds, updating='deferred',workers=-1)
result.x, result.fun

I get the following error message:

PicklingError: Can't pickle <function Ros_custom at 0x0000020247F04C10>: it's not the same object as __main__.Ros_custom

My questions are: why do I get this error, and is there a way to make 'Ros_custom' pickleable so that DE can use all the PC's processors?

Thank you in advance for any advice.

Fredy H.
  • I'm the author of `dill` (and `mystic`, which has a parallel DE solver). You are getting the error because your function is doing something that cannot be pickled. Without seeing more of your code, or some minimal representation that reproduces the issue, it's hard to give more help. You often can rewrite your code to become easier to serialize -- for example, convert nested functions to a class (see the sketch after these comments). `dill` also has serialization variants (see `dill.settings`) that can help get past issues with the global dict. There are other options that store class definitions... you'd need a class however :) – Mike McKerns Oct 30 '20 at 01:24
  • @MikeMcKerns Thank you for your advice. The function 'another_funct' creates some lists, but at a certain point it calls 'OMPython' (mod.simulate) to launch some simulations. I dilled all the functions appearing in the code by the same method (dill.dumps, dill.loads), meaning I work only with 'dilled' functions, but I get the same error message. I also created a class (My_class) containing all my functions, but it seems to be incompatible with DE since the latter asks for a 'function'. In any case I managed to introduce the required function through 'My_class.funct_required()', but then the code hangs. – Fredy H. Oct 30 '20 at 11:36
  • By 'hangs' I mean an infinite loop. I feel kinda blocked at this point since I am new to Python. If you have any advice on how to use all the PC's processors in Python code by other means, I would be thankful, since I have the impression that it won't be possible through DE with workers=-1. Thank you in any case! – Fredy H. Oct 30 '20 at 12:17
  • Let me be a bit clearer -- my suggestion is for you to edit your post to show your code and what you are trying to do. It will be much easier for you to get help if people can actually run the code you are trying to get to work. In the abstract, I can only make some suggestions. You can try the DE solver in `mystic`, which works with `dill` and `multiprocess`... it's similar to (and I believe predates) the scipy DE code. – Mike McKerns Oct 30 '20 at 12:27
  • @MikeMcKerns I understand. I edited my post to show the main parts of my code. Thanks in any case! – Fredy H. Oct 30 '20 at 14:26
  • I edited my original post in order to put a simpler code that people can run. I still get the same error message. – Fredy H. Oct 31 '20 at 06:02
  • For reference: https://stackoverflow.com/q/64635666/2379433 – Mike McKerns Nov 02 '20 at 12:04
  • See point #2 in my response. Don't dump/load with `dill` before the optimization. It's making it worse, not better. You need to make sure the function is serializable within `scipy` -- but don't pass the dumped/loaded function into DE, use the original function. No need to import `pickle` or `dill`... `scipy` will handle it for you. – Mike McKerns Nov 07 '20 at 12:12
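To illustrate the "convert nested functions to a class" suggestion from the comments: below is a minimal sketch (the class name RosObjective and its layout are hypothetical, not from the original post). An instance of a module-level class with a __call__ method pickles cleanly by reference, and differential_evolution accepts any callable:

from scipy.optimize import differential_evolution

class RosObjective:
    # instances of a module-level class are picklable, so the
    # parallel worker processes can reconstruct them on their side
    def __init__(self, a=1.0, b=100.0):
        self.a = a
        self.b = b

    def __call__(self, X):
        x, y = X[0], X[1]
        return (self.a - x)**2 + self.b*(y - x*x)**2

bounds = [(0, 2), (0, 2)]
result = differential_evolution(RosObjective(), bounds,
                                updating='deferred', workers=-1)

Note that on Windows the class still has to live in a plain .py script or importable module, as discussed in the answers below.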

2 Answers


Two things:

  1. I'm not able to reproduce the error you are seeing unless I first pickle/unpickle the custom function.
  2. There's no need to pickle/unpickle the custom function before passing it to the solver.

This seems to work for me. Python 3.6.12 and scipy 1.5.2:

>>> from scipy.optimize import rosen, differential_evolution
>>> bounds = [(0,2), (0, 2)]
>>> 
>>> def Ros_custom(X):
...     x = X[0]
...     y = X[1]
...     a = 1. - x
...     b = y - x*x
...     return a*a + b*b*100
... 
>>> result = differential_evolution(Ros_custom, bounds, updating='deferred',workers=-1)
>>> result.x, result.fun
(array([1., 1.]), 0.0)
>>> 
>>> result
     fun: 0.0
 message: 'Optimization terminated successfully.'
    nfev: 4953
     nit: 164
 success: True
       x: array([1., 1.])
>>> 

I can even nest a function inside of the custom objective:

>>> def foo(a,b):
...   return a*a + b*b*100
... 
>>> def custom(X):
...   x,y = X[0],X[1]
...   return foo(1.-x, y-x*x)
... 
>>> result = differential_evolution(custom, bounds, updating='deferred',workers=-1)
>>> result
     fun: 0.0
 message: 'Optimization terminated successfully.'
    nfev: 4593
     nit: 152
 success: True
       x: array([1., 1.])

So, for me at least, the code works as expected.

You should have no need to serialize/deserialize the function ahead of its use in scipy. Yes, the function needs to be picklable, but scipy will handle that for you. Basically, what's happening under the covers is that your function gets serialized, passed to multiprocessing as a string of bytes, distributed to the processors, then unpickled and used on the target processors.

Like this, for 4 sets of inputs, run one per processor:

>>> import multiprocessing as mp
>>> res = mp.Pool().map(custom, [(0,1), (1,2), (4,9), (3,4)])
>>> list(res)
[101.0, 100.0, 4909.0, 2504.0]
>>> 
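As a quick sanity check (my addition here, not part of the scipy API), you can perform the same round trip by hand with the stdlib pickle that multiprocessing uses. If pickle.dumps raises, the parallel solver will fail the same way:

>>> import pickle
>>> s = pickle.dumps(custom)   # what gets sent to the worker processes
>>> f = pickle.loads(s)        # what each worker reconstructs
>>> f([1, 2])                  # behaves exactly like the original
100.0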

Older versions of multiprocessing had difficulty serializing functions defined in the interpreter, and often needed the code to be executed in a __main__ block. If you are on Windows, this is still often the case... and you might also need to call mp.freeze_support(), depending on how the code in scipy is implemented.
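For reference, a Windows-safe skeleton looks roughly like this (a sketch, assuming the objective is defined at module level in a plain .py script rather than pasted into a notebook cell):

from multiprocessing import freeze_support
from scipy.optimize import differential_evolution

def Ros_custom(X):
    x, y = X[0], X[1]
    return (1. - x)**2 + 100*(y - x*x)**2

if __name__ == '__main__':
    # freeze_support() is only strictly needed for frozen Windows
    # executables; it is harmless everywhere else
    freeze_support()
    bounds = [(0, 2), (0, 2)]
    result = differential_evolution(Ros_custom, bounds,
                                    updating='deferred', workers=-1)
    print(result.x, result.fun)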

I tend to like dill (I'm the author) because it can serialize a broader range of objects than pickle. However, as scipy uses multiprocessing, which uses pickle... I often choose to use mystic (I'm the author), which uses multiprocess (I'm the author), which uses dill. Very roughly, these are equivalent codes, but they all work with dill instead of pickle.

>>> from mystic.solvers import diffev2
>>> from pathos.pools import ProcessPool
>>> diffev2(custom, bounds, npop=40, ftol=1e-10, map=ProcessPool().map)
Optimization terminated successfully.
         Current function value: 0.000000
         Iterations: 42
         Function evaluations: 1720
array([1.00000394, 1.00000836])

With mystic, you get some additional nice features, like a monitor:

>>> from mystic.monitors import VerboseMonitor
>>> mon = VerboseMonitor(5,5)
>>> diffev2(custom, bounds, npop=40, ftol=1e-10, itermon=mon, map=ProcessPool().map)
Generation 0 has ChiSquare: 0.065448
Generation 0 has fit parameters:
 [0.769543181527466, 0.5810893880113548]
Generation 5 has ChiSquare: 0.065448
Generation 5 has fit parameters:
 [0.588156685059123, -0.08325052939774935]
Generation 10 has ChiSquare: 0.060129
Generation 10 has fit parameters:
 [0.8387858177101133, 0.6850849855634057]
Generation 15 has ChiSquare: 0.001492
Generation 15 has fit parameters:
 [1.0904350077743412, 1.2027007403275813]
Generation 20 has ChiSquare: 0.001469
Generation 20 has fit parameters:
 [0.9716429877952866, 0.9466681129902448]
Generation 25 has ChiSquare: 0.000114
Generation 25 has fit parameters:
 [0.9784047411865372, 0.9554056558210251]
Generation 30 has ChiSquare: 0.000000
Generation 30 has fit parameters:
 [0.996105436348129, 0.9934091068974504]
Generation 35 has ChiSquare: 0.000000
Generation 35 has fit parameters:
 [0.996589586891175, 0.9938925277204567]
Generation 40 has ChiSquare: 0.000000
Generation 40 has fit parameters:
 [1.0003791956048833, 1.0007133195321427]
Generation 45 has ChiSquare: 0.000000
Generation 45 has fit parameters:
 [1.0000170425596364, 1.0000396089375592]
Generation 50 has ChiSquare: 0.000000
Generation 50 has fit parameters:
 [0.9999013984263114, 0.9998041148375927]
STOP("VTRChangeOverGeneration with {'ftol': 1e-10, 'gtol': 1e-06, 'generations': 30, 'target': 0.0}")
Optimization terminated successfully.
         Current function value: 0.000000
         Iterations: 54
         Function evaluations: 2200
array([0.99999186, 0.99998338])
>>> 

All of the above are running in parallel.

So, in summary, the code should work as is (and without pre-pickling) -- unless perhaps you are on Windows, where you might need to use freeze_support and run the code in a __main__ block, as sketched above.

Mike McKerns
  • Hello, thank you for your awesome contribution to the community. I do use Windows and I can't get results with the example (Ros_custom) even if I use `if __name__ == '__main__':` before calling D.E. Just for knowledge, if I import the function 'Ros_custom' from another notebook via `from ipynb.fs.full.NameNotebook import Ros_custom`, DE with the option `workers=-1` works. I used `mystic` and it works for the example given here (no need to import from another notebook). In my project, my objective function is like a black box; when I compare `mystic diffev2` with DE I get slightly different values. – Fredy H. Oct 31 '20 at 19:24
  • I suppose that I will have to play with the parameters, although if I use the parameters `npop=10,ScalingFactor=0.8,CrossProbability=0.9` I get roughly the same results, although the simulation takes longer. Then, when I add the argument `map=ProcessPool().map` in `diffev2` in order to use all the PC's processors, I get an error. Is this the equivalent of 'workers=-1'? I wonder too how to make DE work on Windows without needing to import functions from another notebook (since in my project, due to complexity, that seems not to work either). Thanks in any case! – Fredy H. Oct 31 '20 at 19:40
  • The error is : Traceback (most recent call last): File "...Anaconda3\lib\site-packages\multiprocess\pool.py", line 125, in worker result = (True, func(*args, **kwds)) File "..Anaconda3\lib\site-packages\multiprocess\pool.py", line 48, in mapstar return list(map(*args)) File "..\Anaconda3\lib\site-packages\pathos\helpers\mp_helper.py", line 15, in func = lambda args: f(*args) File "..\Anaconda3\lib\site-packages\mystic\tools.py", line 364, in function_wrapper return cost_function(_x) + penalty_function(_x) – Fredy H. Oct 31 '20 at 19:58
  • File "..Anaconda3\lib\site-packages\mystic\tools.py", line 402, in function_wrapper return target_function(x) File "..Anaconda3\lib\site-packages\mystic\tools.py", line 375, in function_wrapper fval = the_function(x, *extra_args) File "", line 59, in Fonct_Obj_General NameError: name 'Simulator' is not defined – Fredy H. Oct 31 '20 at 19:59
  • @FredyH.: very hard to discern what is going on when you are posting tracebacks to the comments. It'd probably be better as an additional question, or a posting on the appropriate github page. However, a `NameError` generally points to a coding error in the naming... but can come from serialization issue on windows. You ran from a `__main__` block, but did you use `freeze_support`? Like this: https://github.com/uqfoundation/pathos/blob/master/examples/test_mpmap_dill.py#L72 – Mike McKerns Nov 01 '20 at 14:02
  • Yes, I did use freeze_support. Well, the thing is that my code works fine when DE uses only 1 processor. I just posted a new question about this, anyways, thanks for your help! – Fredy H. Nov 01 '20 at 18:50

Writing the function in a separate file from the main code worked for me.

Create rosen_custom.py with this code inside:

import numpy as np
def rosen(x):
    x = np.array(x)
    r = np.sum(100.0 * (x[1:] - x[:-1]**2.0)**2.0 + (1 - x[:-1])**2.0,
                  axis=0)
    return r

Then use it in DE:

from scipy.optimize import differential_evolution
from rosen_custom import rosen

bounds = [(0, 2), (0, 2), (0, 2), (0, 2), (0, 2)]

if __name__ == '__main__':   # guard needed for workers=-1 on Windows
    result = differential_evolution(rosen, bounds,
                                    updating='deferred', workers=-1)
    print(result.x, result.fun)
Orkhan Mammadov