
I am trying to minimize a cost function using scipy.optimize.minimize, but it is very slow. My function has close to 5000 variables, and so it is not surprising that scipy is slow. However, if there is a parallel version of scipy.optimize.minimize it might help a great deal.

I was wondering whether such a version of scipy.optimize.minimize exists, or if there is any other scipy/numpy tool available for performing a minimization of this magnitude. I really appreciate any and all help.

Thanks, everybody, for the comments. This is a constrained minimization using the SLSQP solver. I have already spent a lot of time making sure that the cost-function evaluation is optimized, so the problem must lie in the calculation of the gradient or in the constraints. In other words, the time spent on function evaluations is only a small fraction of the total time spent on the minimization.
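For illustration, here is a minimal sketch of the kind of call I am making; the quadratic objective, gradient, and constraint are toy placeholders rather than my real problem, but they show the structure (supplying an analytic gradient via `jac=` keeps SLSQP from approximating it with finite differences):

```python
import numpy as np
from scipy.optimize import minimize

# Toy stand-ins for the real cost function and constraint (hypothetical,
# only to illustrate passing an analytic gradient to SLSQP).
def cost(x):
    return np.sum(x**2)

def cost_grad(x):
    # Analytic gradient: avoids the ~n extra function evaluations per
    # iteration that a finite-difference approximation would need.
    return 2.0 * x

constraints = [{'type': 'eq',
                'fun': lambda x: np.sum(x) - 1.0,
                'jac': lambda x: np.ones_like(x)}]

x0 = np.zeros(5000)
res = minimize(cost, x0, method='SLSQP', jac=cost_grad,
               constraints=constraints, options={'maxiter': 100})
```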

NPL
  • Way too broad! It's important to understand the bottleneck (Which cost function? Which kind of problem? Unconstrained? Variable-bounded? Constrained? Which optimizer? Line searches? Which kind of gradient/Hessian calculation? Zero-order, first-order, second-order, or something in between like quasi-Newton?) and also to know what kind of speedup parallelization can achieve. What kind of parallelization? Multiprocessing? That would be at most 4 times faster on my system... wow... I would not bother to touch it, but would tune my minimization approach instead. As stated, there is no useful answer to this question! – sascha Dec 16 '17 at 23:35
  • A big unknown is your objective function. Is it as fast as it can be, given the number of variables? Is it meaningful to pass it a subset of those variables? Could you pass it multiple 'points' (combinations of those 5000 variables)? – hpaulj Dec 16 '17 at 23:54
  • Welcome to StackOverflow. Community rules define that a high-quality question should always explain both **[A]: one's target** ( define what one considers an acceptable solution ), best in the format of an MCVE -- a **M**inimum **C**omplete **V**erifiable **E**xample of code + data that live-demonstrates the **P**roblem-**u**nder-**R**eview, **[B]:** and also tell others about all the efforts -- **What Have You Tried So Far** -- i.e. what one has spent so far to analyse & test it. So, you might want to update your post to better meet the Community Rules and Best Practices. – user3666197 Dec 17 '17 at 05:28
  • FYI, having used **`scipy.optimize`** tools for QuantFX Machine Learning on Neural Networks sized at about ~ **`4.760.000`** synapses, the problem is neither in the sizing ( `5.000` being ~ 1000x smaller ) nor in the library itself, but in how to use it both efficiently and to its full power. Anyway, one can always read the `scipy.optimize` source code to learn all the details, if & when necessary. Undoubtedly, very good lessons on software engineering are hidden there ( with heaps of FORTRAN legacy still included there ( as of EoY-2017 ) ... Panta Rhei ). – user3666197 Dec 17 '17 at 05:34
  • Perhaps [this](https://stackoverflow.com/questions/13706624/parallel-many-dimensional-optimization) question could help? – Daniel Dec 17 '17 at 14:46

1 Answer


We implemented a parallel version of scipy.optimize.minimize(method='L-BFGS-B') in the package optimparallel, available on PyPI. It can speed up the optimization by evaluating the objective function and the (approximate) gradient in parallel. I have not tested it with 5000 parameters, but with fewer parameters we observe good parallel scaling.
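For example, a minimal usage sketch (the toy objective and gradient are placeholders; the call assumes the `minimize_parallel` function exported by the package, which mirrors the scipy.optimize.minimize interface for L-BFGS-B):

```python
import numpy as np
from optimparallel import minimize_parallel

# Toy objective and its gradient (placeholders for the real cost function).
def fun(x):
    return np.sum((x - 1.0)**2)

def grad(x):
    return 2.0 * (x - 1.0)

x0 = np.zeros(50)

# minimize_parallel mirrors scipy.optimize.minimize(method='L-BFGS-B'),
# but evaluates the objective and gradient in parallel worker processes;
# if jac is omitted, the finite-difference gradient is parallelized instead.
res = minimize_parallel(fun, x0, jac=grad)
print(res.x)
```

The parallel evaluation pays off mainly when a single evaluation of the objective or gradient takes noticeably longer than the process-communication overhead.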

The package is a Python implementation of the R package optimParallel. The method is documented in this R Journal article.

Here is an illustration of the possible parallel scaling: (figure: parallel scaling of optimparallel)

Nairolf