scipy.optimize.minimize('SLSQP') too slow when given 2000 dim variable

Question

I have a non-lenear optimization problem with a constraint and upper/lower bounds, so with scipy I have to use SLSQP. The problem is clearly not convex. I got the jacobian fo both the objective and constraint functions to work correctly (results are good/fast up to 300 input vector). All functions are vectorized and tuned to run very fast. The problem is that using 1000+ input vector takes ages though I can see the minimizer is not calling my functions a lot (objective/constraint/gradients) and seems to spend most of its processing time internally. I read somewhere perf of SLSQP is O(n^3).

Is there a better/faster SLSQP implementation or another method for this type of problem for python ? I tried nlopt and somehow returns wrong results given the exact same functions I use in scipy (with a wrapper to adapt to its method signature). I also failed to use ipopt with pyipopt package, cannot get working ipopt binaries to work with the python wrapper.

UPDATE: if it helps, my input variable is basically a vector of (x,y) tuples or points in 2D surface representing coordinates. With 1000 points, I end up with a 2000 dim input vector. The function I want to optimize calculates optimum position of the points between each other taking into consideration their relationships and other constraints. So the problem is not sparse.

Thanks...

Sandia has bunch of solvers for optimization in dakota (https://dakota.sandia.gov/). Dakota can interface with Python; that is, you can pass your parameters to Dakota from Python and return the results. Here is a [link](https://dakota.sandia.gov/content/about) to the summary about Dakota. — dustin, Nov 27 '17 at 06:44
I'm also sure SLSQP is using BFGS internally. That means some heavy-calculations (at least matrix-vector) in each iteration using a matrix of size N*N, where you want to use 20000 * 20000. — sascha, Nov 27 '17 at 12:14

Erwin Kalvelagen · Answer 1 · 2017-11-27T13:46:48.657

We don't know much about the model, but here are some notes:

SLSQP is really designed for small (dense), well-scaled models.
SLSQP is a local solver. It will accept non-convex problems but will only provide local solutions.
I doubt there is this kind of a complexity bound for SLSQP. Anyway, it does not say much about the performance of a particular problem.
IPOPT is a large-scale, sparse interior point solver. It will find local solutions. It can solve really large models.
Global solvers like BARON, ANTIGONE and COUENNE find global solutions (or bounds on the quality of the solution if you are impatient and stop prematurely). These solvers are (most of the time) slower than local solvers. I am not aware of direct Python links.
If you have a good starting point, a local solver may be just what you need. Using a multistart strategy can help finding better solutions (not proven globally optimal but you get some confidence you have not found a really bad local optimum).
The Python framework PYOMO offers access to many solvers. However you will need to rewrite the model. PYOMO has automatic differentiation: no need to provide gradients.
For testing you can try to transcribe the model in AMPL or GAMS and solve online via NEOS. This will allow you to try out a number of solvers. Both AMPL and GAMS have automatic differentation.

Thanks. Frankly, I used SLSQP as a starting point to fine tune the cost/constraint functions and their jacodians. But it does not scale. My intended use is 20k+ input variables, with the need for real-time optimization when introducing a small perturbation on the already optimized solution (so first run can take some time, it is ok, but not ages). — Riad Souissi, Nov 27 '17 at 12:03
Many solvers (especially active set methods) should do this quite gracefully especially if you can keep things feasible. In the process industry these type of online algorithms are not uncommon. — Erwin Kalvelagen, Nov 27 '17 at 12:16

score 6 · Answer 2 · answered Aug 12 '18 at 12:31

In my opinion scipy.minimze provides an intuitive interface for optimization. I've found that speeding up the cost function (and eventually the gradient function) can give you nice speedups.

For example take the N-Dimensional Rosenbrock function:

import numpy as np
from scipy.optimize import minimize


def rosenbrock(x, N):
    out = 0.0
    for i in range(N-1):
        out += 100.0 * (x[i+1] - x[i]**2)**2 + (1 - x[i])**2
    return out

# slow optimize
N = 20
x_0 = - np.ones(N)
%timeit minimize(rosenbrock, x_0, args=(N,), method='SLSQP', options={'maxiter': 1e4})
res = minimize(rosenbrock, x_0, args=(N,), method='SLSQP', options={'maxiter': 1e4})
print(res.message)

Timing of the optimization yields

102 ms ± 1.86 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
Optimization terminated successfully.

Now you could speed up the objective function using numba, and provide a naive function to compute the gradient like this:

from numba import jit, float64, int64

@jit(float64(float64[:], int64), nopython=True, parallel=True)
def fast_rosenbrock(x, N):
    out = 0.0
    for i in range(N-1):
        out += 100.0 * (x[i+1] - x[i]**2)**2 + (1 - x[i])**2
    return out


@jit(float64[:](float64[:], int64), nopython=True, parallel=True)
def fast_jac(x, N):
    h = 1e-9
    jac = np.zeros_like(x)
    f_0 = fast_rosenbrock(x, N)
    for i in range(N):
        x_d = np.copy(x)
        x_d[i] += h
        f_d = fast_rosenbrock(x_d, N)
        jac[i] = (f_d - f_0) / h
    return jac

Which is basically just adding a decorator to the objective function, allowing parallel computation. Now we can time the optimization again:

print('with fast jacobian')
%timeit minimize(fast_rosenbrock, x_0, args=(N,), method='SLSQP', options={'maxiter': 1e4}, jac=fast_jac)
print('without fast jacobian')
%timeit minimize(fast_rosenbrock, x_0, args=(N,), method='SLSQP', options={'maxiter': 1e4})
res = minimize(fast_rosenbrock, x_0, args=(N,), method='SLSQP', options={'maxiter': 1e4}, jac=fast_jac)
print(res.message)

Trying both, with and without providing the fast jacobian function. The output of this is:

with fast jacobian
9.67 ms ± 488 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
without fast jacobian
27.2 ms ± 2.4 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
Optimization terminated successfully.

Which is an about 10 times speedup for little effort. The improvement you can achieve with this is highly dependent on the inefficiency of your cost function. I had a cost function with several calculations in it and was able to get speedups around 10^2 - 10^3.

The advantage of this approach is its little effort and that you can stay with scipy and its nice interface.

The author clearly mentioned that "the minimizer is not calling my functions a lot (objective/constraint/gradients)". Your suggestion to speed-up the objective/gradient computations, though interesting, is off-topic. Moreover, your code is tested in dimension `N=20` which is not the same setting as the OP ("1000+ input vector"). — paulduf, Oct 05 '21 at 13:45

score 1 · Answer 3 · answered Dec 02 '17 at 13:34

Surprisingly, I found a relatively ok solution using an optimizer from for a deep learning framework, Tensorflow, using basic gradient descent (actually RMSProp, gradient descent with momentum) after I changed the cost function to include the inequality constraint and the bounding constraints as penalties (I suppose this is same as lagrange method). It trains super fast and converges quickly with proper lambda parameters on the constraint penalties. I didn't even have to rewrite the jacobians as TF takes care of that without much speed impact apparently.

Before that, I managed to get NLOPT to work and it is much faster than scipy/SLSQP but still slow on higher dimensions. Also NLOPT/AUGLANG is super fast but converges poorly.

This said, at 20k variables, it is stillslow. Partly due to memory swapping and the cost function being at least O(n^2) from the pair-wise euclidean distance (I use (x-x.t)^2+(y-y.t)^2 with broadcasting). So still not optimal.

"relatively ok solution" to which problem? I'm curious to know — paulduf, Mar 24 '22 at 10:31

scipy.optimize.minimize('SLSQP') too slow when given 2000 dim variable

3 Answers3

Linked