
The main idea here is to understand how black-box functions are used in objective function definitions and how optimization algorithms call these functions.

Let's assume that we have a function defined as follows:

f is the objective function to be minimized for the given problem. Let's say:

f(Xi, Yi) = (Ai * Xi) + (Bi * Xi * Yi) for i = 1, 2, ..., n

where Yi = N(X1, X2, ..., Xn) is a black-box function (a simulation) whose analytical form is unknown and which takes all the Xi's as inputs.

N refers to a network being simulated.

Ai and Bi are constants.

The problem has the following constraint:

X1 + X2 + ... + Xn = C

The function definition below just shows how I've called my simulation results and used them in my optimization objective. I'm also open to suggestions on how it could be improved. (But my main question follows the function definition.)

def fun(X):
    sim_inpfile(X)                      # previously defined function: writes X to the simulation's input file
    p = Popen(['python', 'Test02.py'])  # run the simulation in a separate process
    p.wait()                            # wait for the simulation to finish before reading its results
    Y = []                              # simulation results
    for r in jn.output.Results.res:     # jn: custom module exposing the simulation output
        if r.name in nodes:
            Y.append(r.pr)              # extract Y as a list from the simulation results
    total = 0                           # optimization objective (avoid shadowing the built-in sum)
    for i in range(len(a)):             # len(a) == len(b) == len(X) == len(Y)
        total += a[i]*X[i] + b[i]*X[i]*Y[i]  # a and b are constant lists of the same size as X and Y
    return total                        # objective value built from the simulation results Y

I now call a Python optimization solver:

res = scipy.optimize.minimize(fun, x0, method='SLSQP', bounds=bnds, constraints=cons)  # x0 = initial X values, bnds = variable bounds on X, cons = the equality constraint
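For concreteness, here is a minimal, self-contained sketch of how bnds and cons might be wired up for SLSQP. black_box and the values of a, b, and C are illustrative stand-ins for the real simulation and constants, not the actual problem:

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical stand-in for the simulation: Y = N(X). The real N is a black box.
def black_box(X):
    return np.tanh(X.sum()) * np.ones_like(X)

a = [1.0, 2.0, 3.0]   # constants Ai (illustrative values)
b = [0.5, 0.5, 0.5]   # constants Bi (illustrative values)
C = 6.0               # right-hand side of the equality constraint

def fun(X):
    Y = black_box(np.asarray(X))
    return sum(a[i]*X[i] + b[i]*X[i]*Y[i] for i in range(len(a)))

x0 = np.array([2.0, 2.0, 2.0])
bnds = [(0.0, C)] * 3                                   # variable bounds on each Xi
cons = ({'type': 'eq', 'fun': lambda X: X.sum() - C},)  # X1 + ... + Xn = C

res = minimize(fun, x0, method='SLSQP', bounds=bnds, constraints=cons)
print(res.x)  # the returned solution satisfies sum(X) = C within tolerance
```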

Questions:

  • For each iteration of the algorithm, will the objective function be called for evaluation?

  • If yes, is the way I've coded it an appropriate way to return my objective function with the black-box function evaluation at each iteration of the solver?

  • Also, if there are similar problems or examples in any of the Python optimization packages, or similar questions elsewhere, please share links or threads.

I appreciate your time, and your experience may help me solve this problem.

Janson 7
1 Answer


This is a very broad question and optimization is a very complex topic! So just a few remarks:

For each iteration of the algorithm, will the objective function be called for evaluation?

As you call it, the function will be called multiple times per iteration because you did not provide a Jacobian. SLSQP will use numerical differentiation to reason about the step direction it will take (and its length)!
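A quick way to see this is to count evaluations with a counter on a toy objective (no simulation involved); the calls beyond the number of iterations come from finite-difference gradient estimation:

```python
import numpy as np
from scipy.optimize import minimize

calls = 0

def fun(X):
    global calls
    calls += 1  # count every time SLSQP evaluates the objective
    return (X[0] - 1.0)**2 + (X[1] + 2.0)**2

res = minimize(fun, np.zeros(2), method='SLSQP')
print(res.nit, calls)  # calls exceeds nit: extra evaluations are finite differences
```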

If yes, is the way I've coded it an appropriate way to return my objective function with the black-box function evaluation at each iteration of the solver?

No, probably not!

The first obvious thing is the process-based call of your simulation, which induces overhead. If the simulation itself dominates the runtime this matters less, but it is still the wrong approach in general.
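A sketch of the in-process alternative: instead of spawning a new interpreter with Popen('python Test02.py') on every evaluation, expose the simulation as a function and call it directly. Here run_simulation is a hypothetical entry point standing in for what Test02.py does; its body is just a placeholder:

```python
def run_simulation(X):
    # Placeholder body; the real version would run the network simulation
    # and return the results that jn.output.Results currently provides.
    return [x * 0.5 for x in X]

def fun(X):
    Y = run_simulation(X)  # in-process call: no interpreter startup, no file handoff
    a = [1.0] * len(X)     # illustrative constants
    b = [1.0] * len(X)
    return sum(a[i]*X[i] + b[i]*X[i]*Y[i] for i in range(len(X)))

val = fun([1.0, 2.0])
```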

Other remarks

  • Most of scipy's optimizers assume a smooth and deterministic optimization problem. Yours might violate both (and a lot of bad stuff can happen). Smoothness is probably tough to reason about, but deterministic vs. random behavior can be analyzed once you understand your simulation (does it use random numbers?)
    • Keep in mind that randomness might be even more critical if it's an iterative simulation which is stopped after x iterations without any formal convergence criterion
    • All these things are even more dramatic in the case of numerical differentiation (simplified example: a call with x vs. a call with x+1e-15; if the simulation error is on that scale or higher: bad)
  • Those first-order / second-order optimizers (like SLSQP) are not designed to be efficient in terms of function-evaluations
    • Numerical-differentiation makes things worse
  • Global optimization / derivative-free optimization typically uses other approaches than those found in scipy
    • Often zero-order methods (not gradient-based)
    • Often introducing some proxy-loss / assumptions like RBF-kernels and co.
    • They are much more efficient in terms of function-calls and often are also designed not to break in case of non-smoothness and non-deterministic behavior
    • But: Global-optimization / Derivative-free optimization in general is even harder than smooth deterministic optimization (which is already hard in theory)
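To illustrate how numerical differentiation and simulation noise interact badly, here is a toy sketch; the noise level and step sizes are illustrative, not taken from the question:

```python
import numpy as np

rng = np.random.default_rng(0)

def noisy(x, eps=1e-6):
    # f(x) = x**2 plus simulated "solver noise" of magnitude ~eps
    return x**2 + eps * rng.standard_normal()

def fd_slope(f, x, h):
    # one-sided finite-difference estimate of f'(x)
    return (f(x + h) - f(x)) / h

true_slope = 2.0  # exact derivative of x**2 at x = 1
err_large_h = abs(fd_slope(noisy, 1.0, 1e-2) - true_slope)
err_tiny_h  = abs(fd_slope(noisy, 1.0, 1e-8) - true_slope)
print(err_large_h, err_tiny_h)  # the tiny step amplifies the noise by ~eps/h
```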

So despite the fact that there are not many specifics in your question (details and assumptions matter!), scipy is not the most useful library for what you want to achieve! There are survey papers (example) about global / derivative-free optimization outlining alternatives. One example (arbitrarily chosen; open-source) would be CoinOR's RBFOpt, whose page summarizes the problems from above:

It does not assume that f(x) is known in analytical form: f(x) is simply a black-box that, given input values, produces output values. The bounds on the variables x_L, x_U are assumed to be finite. RBFOpt is especially targeted at problems for which each evaluation of the objective function f(x) is expensive (in terms of computing time, or cost, or some other measure) and we want to find a global minimum of the function with as few function evaluations as possible. Since this is a very difficult class of problems (we do not assume availability of first order derivatives), RBFOpt works best on problems that are relatively small dimensional (up to 20 variables, ideally less than 10) and for which the bounding box is not too large.
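Staying within scipy just for illustration, differential_evolution is one derivative-free (zero-order) global method. This toy sketch uses a smooth stand-in objective with box bounds only; handling the equality constraint would need e.g. a penalty term or, in newer scipy versions, the constraints argument:

```python
import numpy as np
from scipy.optimize import differential_evolution

def objective(X):
    # smooth toy stand-in for an expensive black box; minimum at X = (0.5, 0.5, 0.5)
    return float(np.sum((X - 0.5)**2))

bounds = [(0.0, 1.0)] * 3  # box bounds only; no derivatives are ever requested
res = differential_evolution(objective, bounds, seed=1, tol=1e-8)
print(res.x)  # close to [0.5, 0.5, 0.5]
```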

sascha
  • First, thanks for your time; I'm not a programmer by profession. Your answer says: "The first obvious thing is the process-based call of your simulation which induces overhead. If your simulation is dominating in regards to time, this is less relevant, but in this case it's the wrong approach in general." What is a process-based call (examples or links), in general and with respect to my problem? Can you suggest a better way to structure my function definition? Each simulation costs me no more than 5 seconds. The Y variables in my objective function are the ones I need from the simulation. Thanks – Janson 7 Feb 21 '18 at 07:54
  • Instead of calling another Python process to run the sim, import it and run it in the same interpreter. But this is imho the least of your problems (and it won't help much if 5 seconds are needed each time). – sascha Feb 21 '18 at 12:28