
Some background:

I am using the Nelder-Mead simplex algorithm from scipy.optimize.minimize for hyperparameter optimization of a deep learning model. For an input x and a function f, minimize tries to minimize the function value f(x) by varying x. In my case, x holds the hyperparameters of the model f, which makes predictions on a fixed set of training examples.

Because f is a large model and there are many training examples, each call to f(x) takes about 10 minutes, even when the embarrassingly parallel job of predicting on all training examples is distributed across 20 RTX 2080 GPUs. Every step is therefore expensive.

For one reason or another (a crash, running out of time on the GPUs), the script sometimes stops in the middle of optimizing. It is therefore desirable to save the state of the optimization so I can continue from where it left off. I can save the hyperparameter values x at every Nelder-Mead step, but that only goes so far: even if x is recovered (call the recovered version x'), the Nelder-Mead simplex is lost. If optimization of f is restarted at x', minimize has to rebuild the simplex by evaluating f(x' + p) N times, where p is a perturbation along one dimension of x and N is either dim(x) or dim(x) + 1. In my case x is high-dimensional (>20), so it takes ~3 hours just to recover the simplex.

The question:

I need a way to access the simplex at every step in case of a crash. Others have suggested using a callback to recover parameter and function values during optimization with scipy.optimize.minimize (not necessarily with Nelder-Mead). However, the documentation for minimize states that the only method whose callback receives both the current parameter values (x) and an OptimizeResult object (which could carry the simplex in the Nelder-Mead case) is "trust-constr". In other words, I'm fairly sure a callback with the Nelder-Mead method only gives you access to x.
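For reference, a minimal sketch of what the callback route gives you with Nelder-Mead; f here is just a stand-in for the real objective, and latest_x.npy is a placeholder file name. The callback receives only the current best point, not the simplex:

```python
import numpy as np
from scipy.optimize import minimize

def f(x):
    # stand-in for the expensive objective (the distributed prediction run)
    return float(np.sum(x ** 2))

def save_x(xk):
    # Nelder-Mead's callback is called with only the current parameter
    # vector xk, so this is all the state that can be checkpointed this way
    np.save("latest_x.npy", xk)

x0 = np.zeros(25)
res = minimize(f, x0, method="Nelder-Mead", callback=save_x,
               options={"maxiter": 1000})
```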

The best solution we've come up with is to edit the _minimize_neldermead function in the scipy source to save the simplex after each step. Is there a more elegant way to accomplish this? Does minimize already have this ability and we just can't find it?
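For what it's worth, if the simplex can be captured somehow (e.g. by the source edit described above), a rough sketch of how it could be fed back in on restart: Nelder-Mead accepts an initial_simplex option, which would avoid the N warm-up evaluations of f(x' + p). The file name sim.npy and the objective f are assumptions for illustration:

```python
import numpy as np
from scipy.optimize import minimize

def f(x):
    # stand-in for the expensive objective
    return float(np.sum(x ** 2))

# sim.npy is assumed to hold the saved (N + 1, N) simplex
sim = np.load("sim.npy")

# initial_simplex overrides the simplex that would otherwise be built
# from perturbations of x0
res = minimize(f, sim[0], method="Nelder-Mead",
               options={"initial_simplex": sim, "maxiter": 1000})
```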

  • I did not analyze everything in detail, but assuming that we are working with a black-box model here (*the function-value sequence is the only input*), I don't see much of a problem: store your x-vectors in each iteration (not just the last), also the output, and when there is a need for continuation, re-simulate your *optimization path* by re-using your cached values without touching your backend, e.g. by counting function evals and comparing the count against the number of values in the cache -> use cache if available, pass through to your backend if not (= re-calc internals with cached path). – sascha Jul 31 '20 at 20:55
  • Actually, assuming determinism in regards to scipy's internals, you won't even need to store all the x-vectors, but just the initial vector and the function-eval results. This will be enough to recover every internal bit. – sascha Jul 31 '20 at 20:59
  • I agree with the answer given in the above comments by sascha. It is just necessary to write the function f(x) so that it can make a record of the argument x and value f(x) every time it is called. The state of a Nelder Mead minimization after each iteration is the simplex and the values at each vertex. However, each iteration can require from 1 to N+1 evaluations of f(x), so the state after each evaluation is more complicated. Fortunately, it is not necessary to preserve the state if you can rerun the optimization up to the point of failure using saved evaluations. – 10ppb Dec 08 '21 at 22:42
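To make the approach described in the comments concrete, a rough sketch of a replaying objective wrapper, assuming scipy's internals are deterministic and the same x0 is reused on restart; names such as feval_cache.pkl and expensive_f are placeholders. Every evaluation is logged to disk, and on restart the first len(cache) calls are answered from the log, so scipy re-traces its internal path (including the simplex) without touching the GPU backend:

```python
import os
import pickle
import numpy as np
from scipy.optimize import minimize

CACHE_FILE = "feval_cache.pkl"  # assumed name for the evaluation log

def expensive_f(x):
    # stand-in for the real objective (distributed prediction run)
    return float(np.sum(x ** 2))

class ReplayingObjective:
    """Serve cached results for the first len(cache) calls, then fall
    through to the real backend and append each new result to the log."""

    def __init__(self, backend, cache_file=CACHE_FILE):
        self.backend = backend
        self.cache_file = cache_file
        self.calls = 0
        if os.path.exists(cache_file):
            with open(cache_file, "rb") as fh:
                self.cache = pickle.load(fh)  # list of (x, fx) pairs
        else:
            self.cache = []

    def __call__(self, x):
        if self.calls < len(self.cache):
            # replay: return the logged value so scipy rebuilds its
            # internal state without calling the backend
            x_cached, fx = self.cache[self.calls]
            assert np.allclose(x, x_cached), "replay diverged from cached path"
            self.calls += 1
            return fx
        # past the cached prefix: evaluate for real and extend the log
        fx = self.backend(x)
        self.cache.append((np.array(x, copy=True), fx))
        with open(self.cache_file, "wb") as fh:
            pickle.dump(self.cache, fh)
        self.calls += 1
        return fx

x0 = np.zeros(25)
res = minimize(ReplayingObjective(expensive_f), x0, method="Nelder-Mead",
               options={"maxiter": 1000})
```

After a crash, re-running the same script with the same x0 would replay the cached evaluations in a few seconds and then resume real evaluations from the point of failure.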

0 Answers