9

I'm looking to run a gradient descent optimization to minimize a cost function over a set of variables. My cost function is very computationally expensive to evaluate, so I'm looking for a popular library with a fast implementation of GD. What is the recommended library/reference?

Prashant Kumar • 20,069
Jim • 4,509
  • What would be _fast_? Non-exact? Cached answers for previous queries? Or some other kind of criterion? – sarnold Jul 16 '12 at 23:14
  • @sarnold True. Non-exact is okay, I don't need to hit the global optimum. I just want something that can quickly achieve results that are better than random search :) I'd like to play with the time I allow it to run to see the time/improvement tradeoff. – Jim Jul 16 '12 at 23:19
  • Why is your implementation slow? – Jacob Jul 16 '12 at 23:28
  • @Jacob because the act of computing the cost for an instantiation involves expensive image manipulations across a large database :) – Jim Jul 16 '12 at 23:37
  • 2
    If you want fast, you don't want gradient descent. Try something more sophisticated with GD as a fallback. What is best depends a lot on the structure of your problem, though. Options include conjugate gradient, biconjugate gradient, or my personal favorite if you're doing data fitting, Levenberg-Marquardt - the list is as long as your arm. – Michael Anderson Jul 17 '12 at 00:52
  • The techniques that @MichaelAnderson suggests are excellent, especially if you have analytical expressions for your gradient, and your cost function and its gradients are continuous. If your gradients are numerical, then you may be better off with downhill simplex. Each numerical gradient costs 2*N function evaluations (N = number of variables). This can easily make derivative-based approaches less efficient than non-derivative approaches. – sfstewman Jul 18 '12 at 03:20

4 Answers

10

GSL is a great (and free) library that already implements common functions of mathematical and scientific interest.

You can peruse the entire reference manual online. Poking around, the multidimensional minimization section starts to look interesting, but I think we'd need to know more about the problem.
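
For a taste of the API, here's a minimal sketch of GSL's gradient-based multidimensional minimizer (`gsl_multimin_fdfminimizer`) using the Fletcher-Reeves conjugate gradient method. The quadratic cost function, starting point, and tolerances below are illustrative placeholders, not anything from the question:

```c
#include <stdio.h>
#include <gsl/gsl_multimin.h>

/* Toy stand-in for the expensive cost: f(x, y) = (x-1)^2 + 10*(y-2)^2 */
static double my_f(const gsl_vector *v, void *params) {
    double x = gsl_vector_get(v, 0), y = gsl_vector_get(v, 1);
    return (x - 1.0) * (x - 1.0) + 10.0 * (y - 2.0) * (y - 2.0);
}

/* Analytic gradient of my_f */
static void my_df(const gsl_vector *v, void *params, gsl_vector *df) {
    double x = gsl_vector_get(v, 0), y = gsl_vector_get(v, 1);
    gsl_vector_set(df, 0, 2.0 * (x - 1.0));
    gsl_vector_set(df, 1, 20.0 * (y - 2.0));
}

static void my_fdf(const gsl_vector *v, void *params, double *f, gsl_vector *df) {
    *f = my_f(v, params);
    my_df(v, params, df);
}

int main(void) {
    /* fields: f, df, fdf, n (dimension), params */
    gsl_multimin_function_fdf func = { my_f, my_df, my_fdf, 2, NULL };

    gsl_vector *x = gsl_vector_alloc(2);   /* starting point */
    gsl_vector_set(x, 0, 5.0);
    gsl_vector_set(x, 1, 7.0);

    gsl_multimin_fdfminimizer *s = gsl_multimin_fdfminimizer_alloc(
        gsl_multimin_fdfminimizer_conjugate_fr, 2);
    /* first step size 0.01, line-search tolerance 1e-4 */
    gsl_multimin_fdfminimizer_set(s, &func, x, 0.01, 1e-4);

    int status, iter = 0;
    do {
        iter++;
        status = gsl_multimin_fdfminimizer_iterate(s);
        if (status) break;
        /* stop once the gradient norm falls below 1e-3 */
        status = gsl_multimin_test_gradient(s->gradient, 1e-3);
    } while (status == GSL_CONTINUE && iter < 100);

    printf("min at (%g, %g), f = %g\n",
           gsl_vector_get(s->x, 0), gsl_vector_get(s->x, 1), s->f);

    gsl_multimin_fdfminimizer_free(s);
    gsl_vector_free(x);
    return 0;
}
```

Compile with something like `gcc example.c -lgsl -lgslcblas -lm`.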

Prashant Kumar • 20,069
  • 1
    Those alternate algorithms listed in the GSL reference are conjugate/biconjugate gradient methods, and should give better performance than gradient descent so long as your data is "well behaved". – Michael Anderson Jul 17 '12 at 00:55
  • 2
    And if you're numerically calculating your derivatives from function values then you'd probably want this instead: http://www.gnu.org/software/gsl/manual/html_node/Multimin-Algorithms-without-Derivatives.html – Michael Anderson Jul 17 '12 at 00:57
  • This looks like a really good answer, but I'm having a really hard time getting this going with VS2010 (picky, I know...) – Jim Jul 17 '12 at 02:48
  • I think it's a fairly reasonable picky :) Try this: [link](http://www.quantcode.com/modules/smartfaq/faq.php?faqid=94) – Prashant Kumar Jul 17 '12 at 02:54
  • Otherwise, I think the CPP files can simply be included in whatever project you may be working on. – Prashant Kumar Jul 17 '12 at 02:57
5

It sounds like you're fairly new to minimization methods. Whenever I need to learn a new set of numeric methods, I usually look in Numerical Recipes. It's a book that provides a nice overview of the most common methods in the field, their tradeoffs, and (importantly) where to look in the literature for more information. It's usually not where I stop, but it's often a helpful starting point.

For example, if your function is costly, then your goal is to minimize the number of evaluations needed to converge. If you have analytical expressions for the gradient, then a gradient-based method will probably work to your advantage, assuming that the function and its gradient are well-behaved (lack singularities) in the domain of interest.

If you don't have analytical gradients, then you're almost always better off using an approach like downhill simplex that only evaluates the function (not its gradients). Numerical gradients are expensive.
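
To make that cost concrete: a central-difference gradient needs two extra evaluations per variable, so each full gradient costs 2*N calls to the cost function (as one of the comments on the question also notes). A sketch in plain C, where the `cost` callback and the step `h` are hypothetical placeholders:

```c
#include <stdlib.h>
#include <string.h>

/* Central-difference gradient: 2*N evaluations of the (expensive)
 * cost function, where N is the number of variables. */
void numerical_gradient(double (*cost)(const double *x, size_t n),
                        const double *x, size_t n, double h, double *grad)
{
    double *xp = malloc(n * sizeof *xp);   /* perturbed copy of x */
    memcpy(xp, x, n * sizeof *xp);
    for (size_t i = 0; i < n; i++) {
        double xi = xp[i];
        xp[i] = xi + h;
        double fplus = cost(xp, n);        /* evaluation 2*i + 1 */
        xp[i] = xi - h;
        double fminus = cost(xp, n);       /* evaluation 2*i + 2 */
        xp[i] = xi;                        /* restore component */
        grad[i] = (fplus - fminus) / (2.0 * h);
    }
    free(xp);
}
```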

Also note that all of these approaches will converge to local minima, so they're fairly sensitive to the point at which you initially start the optimizer. Global optimization is a totally different beast.

As a final thought, almost all of the code you can find for minimization will be reasonably efficient. The real cost of minimization is in the cost function. You should spend time profiling and optimizing your cost function, and select an algorithm that will minimize the number of times you need to call it (methods like downhill simplex, conjugate gradient, and BFGS all shine on different kinds of problems).
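
As an illustration of a function-values-only method, here's a minimal sketch using GSL's Nelder-Mead (downhill simplex) minimizer, `gsl_multimin_fminimizer_nmsimplex2`, with a toy quadratic standing in for the expensive cost function; the starting point, step sizes, and stopping tolerance are placeholders:

```c
#include <stdio.h>
#include <gsl/gsl_multimin.h>

/* Toy stand-in for the expensive cost function. */
static double my_cost(const gsl_vector *v, void *params) {
    double x = gsl_vector_get(v, 0), y = gsl_vector_get(v, 1);
    return (x - 1.0) * (x - 1.0) + 10.0 * (y - 2.0) * (y - 2.0);
}

int main(void) {
    /* fields: f, n (dimension), params */
    gsl_multimin_function func = { my_cost, 2, NULL };

    gsl_vector *x = gsl_vector_alloc(2);      /* starting point */
    gsl_vector_set_all(x, 5.0);
    gsl_vector *step = gsl_vector_alloc(2);   /* initial simplex size */
    gsl_vector_set_all(step, 1.0);

    gsl_multimin_fminimizer *s = gsl_multimin_fminimizer_alloc(
        gsl_multimin_fminimizer_nmsimplex2, 2);
    gsl_multimin_fminimizer_set(s, &func, x, step);

    int status, iter = 0;
    do {
        iter++;
        status = gsl_multimin_fminimizer_iterate(s);
        if (status) break;
        /* stop when the simplex has shrunk below the tolerance */
        status = gsl_multimin_test_size(gsl_multimin_fminimizer_size(s), 1e-3);
    } while (status == GSL_CONTINUE && iter < 500);

    printf("min at (%g, %g), f = %g after %d iterations\n",
           gsl_vector_get(s->x, 0), gsl_vector_get(s->x, 1),
           s->fval, iter);

    gsl_multimin_fminimizer_free(s);
    gsl_vector_free(x);
    gsl_vector_free(step);
    return 0;
}
```

Note that the loop only ever calls the cost function, never a gradient, which is exactly why this family of methods can win when gradients would have to be approximated numerically.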

In terms of actual code, you can find a lot of nice routines at NETLIB, in addition to the other libraries that have been mentioned. Most of the routines are in FORTRAN 77, but not all; to convert them to C, f2c is quite useful.

Dharmin B • 3
sfstewman • 5,589
  • 1
  • An additional note on using Fortran functions is that it's usually pretty easy to link C and Fortran code together, especially well-structured Fortran libraries. – Michael Anderson Aug 30 '13 at 01:22
  • I'm a junior engineer and had never heard of downhill simplex; teachers were always talking about gradients, but this looks just amazing (I have a huge C++/simulation background), so I have to thank you for the sake of serendipity :) – Cevik Jan 09 '22 at 22:24
  • Would you say that combining Monte Carlo sampling for starting points with a downhill simplex whose initial step is half the average distance between two points of the point cloud is the way to go for global optimization in general? – Cevik Jan 09 '22 at 22:29
4

Among the best-respected libraries for this kind of optimization work are the NAG libraries. These are used all over the world in universities and industry. They're available for C / FORTRAN. They're very non-free, and contain a lot more than just minimisation functions; a lot of general numerical mathematics is covered.

Anyway, I suspect this library is overkill for what you need. But here are the parts pertaining to minimisation: Local Minimisation and Global Minimisation.

Michael Anderson • 70,661
  • 1
    This library also looks really good, but the "very non-free" part makes me a little leery about using it – Jim Jul 17 '12 at 02:49
2

Try CPLEX, which is available for free to students.

Jacob • 34,255