1

I am familiar with some of the functions in scipy.optimize.optimize and have in the past used fmin_cg to minimize a function where I knew the derivative. However, I now have a formula which is not easily differentiated.

Several of the functions in that module (fmin_cg, for instance) do not actually require the derivative to be provided. I assume that they then approximate the derivative numerically by adding a small value to each of the parameters in turn - is that correct?
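
Roughly, I imagine something like the following forward-difference scheme (my own sketch with an arbitrary step size, not the actual SciPy internals):

import numpy as np

def approx_gradient(func, x, h=1e-6):
    # Perturb each parameter in turn and take a forward difference.
    x = np.asarray(x, dtype=float)
    f0 = func(x)
    grad = np.empty_like(x)
    for i in range(x.size):
        xh = x.copy()
        xh[i] += h
        grad[i] = (func(xh) - f0) / h
    return grad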

My main question is this: Which of the functions (or one from elsewhere) is the best to use when minimising a function over multiple parameters with no given derivative?

thornate
  • You are correct about the method of approximating the derivative (Jacobian) numerically. I have seen this when using `scipy.optimize.leastsq`. – wim Mar 08 '12 at 05:51
  • Just curious, how many variables do you have ? – denis Mar 14 '12 at 10:48

2 Answers

3

I'm not too familiar with what's available in SciPy, but the Downhill Simplex method (aka Nelder-Mead or the Amoeba method) frequently works well for multidimensional optimization.

Looking at the SciPy documentation now, it appears to be available as an option in the minimize() function via the method='Nelder-Mead' argument.
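
Something like this should work (a minimal sketch; the quadratic objective and starting point are just placeholders for your own function):

import numpy as np
from scipy.optimize import minimize

def f(x):
    # Placeholder objective; no analytic derivative supplied.
    return (x[0] - 1.0)**2 + (x[1] + 2.5)**2

res = minimize(f, x0=np.array([0.0, 0.0]), method='Nelder-Mead')
print(res.x, res.fun, res.success)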

Don't confuse it with the Simplex (Dantzig) algorithm for Linear Programming...

Drew Hall
  • It is also available as the [fmin](http://docs.scipy.org/doc/scipy-0.10.1/reference/generated/scipy.optimize.fmin.html#scipy.optimize.fmin) function. – pv. Mar 11 '12 at 11:17
3

Yes: calling fmin_bfgs or fmin_cg as

fmin_xx( func, x0, fprime=None, epsilon=.001 ... )

estimates the gradient at x by finite differences, roughly (func(x + epsilon * e_i) - func(x)) / epsilon for each unit vector e_i. (fmin_powell and plain fmin are derivative-free and never need a gradient at all.)
Which is "best" for your application, though, depends strongly on how smooth your function is, and how many variables.
Plain Nelder-Mead, fmin, is a good first choice -- slow but sure; unfortunately the scipy Nelder-Mead starts off with a fixed-size simplex, .05 / .00025 regardless of the scale of x.
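
A minimal sketch of both routes, with a placeholder objective standing in for your own function:

import numpy as np
from scipy.optimize import fmin, fmin_cg

def func(x):
    # Placeholder for your hard-to-differentiate formula.
    return np.sum((x - np.array([1.0, -2.0, 0.5])) ** 2)

x0 = np.zeros(3)
x_nm = fmin(func, x0)                                # Nelder-Mead, derivative-free
x_cg = fmin_cg(func, x0, fprime=None, epsilon=1e-3)  # CG with finite-difference gradient
print(x_nm)
print(x_cg)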

I've heard that fmin_tnc in scipy.optimize.tnc is good:

fmin_tnc( func, x0, approx_grad=True, epsilon=.001 ... )   # finite-difference gradient
fmin_tnc( func_and_grad, x0 ... )                          # func_and_grad returns (value, your own gradient)

(fmin_tnc is roughly fmin_ncg with bound constraints, nice progress messages so you can see what's happening, and somewhat different arguments.)
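
A minimal sketch of the approx_grad route; the objective and bounds are made up just to show the call:

import numpy as np
from scipy.optimize import fmin_tnc

def func(x):
    # Placeholder objective.
    return (x[0] - 3.0) ** 2 + (x[1] - 0.5) ** 4

x0 = np.array([0.0, 0.0])
x_opt, nfeval, rc = fmin_tnc(func, x0, approx_grad=True, epsilon=1e-3,
                             bounds=[(0, 10), (-1, 1)])
print(x_opt, nfeval, rc)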

denis