
I am new to Octave, and I am trying to implement the steepest descent algorithm in it.

For example, minimization of f(x1,x2) = x1^3 + x2^3 - 2*x1*x2:

  1. Estimate a starting design point x0, iteration counter k = 0, and convergence parameter tolerance = 0.1. Say this starting point is (1,0).

  2. Compute the gradient of f(x1,x2) at the current point x(k) as grad(f). I will use numerical differentiation here.

    d/dx1 (f) = lim (h->0) (f(x1+h,x2) - f(x1,x2) )/h
    

    This is grad(f)=(3*x1^2 - 2*x2, 3*x2^2 - 2*x1)

    grad(f) at (1,0) is c0 = (3,-2)

  3. Since the L2 norm of c0 > tolerance, we proceed to the next step.

  4. direction d0 = -c0 = (-3,2)

  5. Calculate the step size a by minimizing f(x0 + a*d0). Since x0 + a*d0 = (1-3a, 2a), this means minimizing g(a) = (1-3a)^3 + (2a)^3 - 2*(1-3a)*(2a). I am not keeping a constant step size.

  6. Update: new [x1,x2] = old [x1,x2] + a*d0.
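The forward-difference gradient from step 2 can be sketched in Octave with anonymous functions; the step size h below is my assumption, not something from the question:

```octave
% Forward-difference approximation of the gradient (step 2).
% The step h is an assumption; making it smaller is not always
% better, because of floating-point rounding error.
f = @(x) x(1)^3 + x(2)^3 - 2*x(1)*x(2);
h = 1e-6;
numgrad = @(x) [ (f(x + [h; 0]) - f(x)) / h ;
                 (f(x + [0; h]) - f(x)) / h ];
g = numgrad([1; 0]);   % close to the analytic gradient (3, -2)
```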

Everything is fine until step 5. I don't know how to represent a function of one variable, or directly get its minimum value, in Octave. How can I do it?
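For context, here is how far I can sketch step 5 with Octave's built-in one-variable minimizer `fminbnd`; the search interval [0, 1] is an assumption on my part:

```octave
% Sketch of step 5: minimize g(a) = f(x0 + a*d0) over a bracket.
% The bracket [0, 1] is an assumption -- fminbnd needs an interval.
f  = @(x) x(1)^3 + x(2)^3 - 2*x(1)*x(2);
x0 = [1; 0];                        % current point
c0 = [3; -2];                       % gradient at x0
d0 = -c0;                           % descent direction
g  = @(a) f(x0 + a*d0);             % one-variable function of the step size
[a_opt, g_min] = fminbnd(g, 0, 1);  % minimize g(a) on [0, 1]
x1 = x0 + a_opt*d0;                 % step 6: update the point
```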

Edit: How can we use steepest descent with this convex function: f(x, y) = 4x^2 - 4xy + 2y^2?

voxter

1 Answer


Well, the short answer is "you cannot". Not in general, at least. The optimization problem in step 5 is hard on its own: even though it is one dimensional, you are optimizing a highly non-convex function. What you can do instead is:

  • run a log-scale line search (just sampling bigger and bigger steps and choosing the minimum);
  • run another "nested" optimizer, like regular steepest descent with a fixed step size, on the problem in step 5; or
  • approximate the function with something simpler (like a second-order polynomial) and go directly to the optimum of the approximation.

Whichever way you choose, it will be an approximation; there is no way to get to the actual minimum numerically. The only way to solve it exactly is to work on the symbolic level, and if the function is simple enough (like a polynomial of small degree) you can find its extrema analytically.
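The log-scale line search mentioned above could be sketched like this; the grid of candidate steps is arbitrary and would need tuning for your problem:

```octave
% Log-scale line search: sample step sizes on a logarithmic grid
% and keep the best one.  The grid bounds (1e-4 to 1) and the number
% of samples (50) are assumptions.
f  = @(x) x(1)^3 + x(2)^3 - 2*x(1)*x(2);
x0 = [1; 0];
d0 = [-3; 2];                           % descent direction from the question
steps = logspace(-4, 0, 50);            % candidate step sizes
vals  = arrayfun(@(a) f(x0 + a*d0), steps);
[best_val, idx] = min(vals);
a = steps(idx);                         % approximate minimizer along d0
```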


As a side note, your whole optimization problem is ill defined, unless there are some constraints you are not talking about. The function is unbounded below: for example, f(t, 0) = t^3 goes to -infinity as t goes to -infinity, so it decreases indefinitely and has no minimum.


lejlot
  • Well, but if our function is convex, we can still find the minimum value, like this: http://www.math.usm.edu/lambers/mat419/lecture10.pdf – voxter Jul 31 '16 at 12:45
  • Please look at their examples. The problem is I don't know how to implement it (step 5) in Octave, because it is an equation with a symbolic variable. – voxter Jul 31 '16 at 12:48
  • Your function is not convex. Plus, as stated in the answer, the whole optimization problem is flawed, as it will diverge to -infinity whatever you implement (your optimization is unbounded). – lejlot Jul 31 '16 at 12:52
  • If our function is f(x, y) = 4x^2 − 4xy + 2y^2 (like their example), can we do it? – voxter Jul 31 '16 at 12:55
  • If the function is simple enough that, restricted to the direction of the gradient (step 5), it has **an easy minimum to find**, then step 5 is simply to find it. If it is quadratic, you have a formula for the minimum of a parabola: apply it. If it is not quadratic, yet still convex, you need another optimizer run in a nested fashion (as said in the answer: GD with a fixed step size, or even some simple "binary search"). – lejlot Jul 31 '16 at 12:58
  • In their example, they minimize the function `f((2, 3) − t(4, 4))` by finding the t at which its derivative = 0. It is an equation with a symbolic variable, and I don't know how to do that in Octave. How can we write an equation and find its roots? – voxter Jul 31 '16 at 13:11
  • There is nothing "symbolic" here; t is just a name for the argument. You compute the derivatives the same way as before, but instead of "x1" and "x2" your variable is "t", that's all. Octave will not do it for you. You can think about it as a recursive call to your own minimizer. In practice no one does this kind of thing, as it requires complex, problem-specific mathematical analysis; you cannot really have a single general scheme with such precision. – lejlot Jul 31 '16 at 13:13
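For the quadratic case discussed in these comments, step 5 does have a closed form: writing f(x, y) = 4x^2 − 4xy + 2y^2 as 0.5*z'*A*z with A = [8 -4; -4 4], the exact minimizer of f(z - a*g) along the gradient g = A*z is a = (g'*g)/(g'*A*g). A minimal sketch, where the starting point (2, 3) follows the linked lecture example and the tolerance is an assumption:

```octave
% Steepest descent with an exact (closed-form) line search for the
% quadratic f(z) = 0.5*z'*A*z, i.e. f(x,y) = 4x^2 - 4xy + 2y^2.
A   = [8 -4; -4 4];
z   = [2; 3];                   % starting point from the lecture example
tol = 1e-6;
while norm(A*z) > tol           % gradient of 0.5*z'*A*z is A*z
  g = A*z;
  a = (g'*g) / (g'*A*g);        % exact minimizer of f(z - a*g)
  z = z - a*g;
end
% z converges to the unique minimizer (0, 0)
```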