
I'm trying to minimize a discretized function using the method of steepest descent. This should be fairly straightforward, but I'm having trouble with the search 'climbing' out of any local minimum. Here's my code in Mathematica, but the syntax should be easy to follow.

x = {some ordered pair as a beginning search point};
h = 0.0000001; (* something rather small *)
lambda = 1;
While[True,
  x1 = x[[1]];
  x2 = x[[2]];
  (* note: the differences are reversed, so these are the negatives of the partial derivatives *)
  x1Gradient = (f[x1-h,x2]-f[x1+h,x2])/(2h);
  x2Gradient = (f[x1,x2-h]-f[x1,x2+h])/(2h);
  gradient = {x1Gradient,x2Gradient};

  (* stop when the norm of the gradient is small *)
  If[Sqrt[x1Gradient^2 + x2Gradient^2] > 0.000001,
    xNew = x + lambda*gradient,
    Break[];
  ];

  (* either accept xNew or reduce lambda *)
  If[f[xNew[[1]],xNew[[2]]] < f[x1,x2],
    x = xNew,
    lambda = lambda/2;
  ];
];

Why would this ever climb a hill? I'm puzzled because I test whether the new value is less than the old, and I don't accept it when it isn't! Thoughts?

John Matok

3 Answers

From the Unconstrained Optimization Tutorial, p. 4 (available at http://www.wolfram.com/learningcenter/tutorialcollection/):

"Steepest descent is indeed a possible strategy for local minimization, but it often does not converge quickly. In subsequent steps in this example, you may notice that the search direction is not exactly perpendicular to the contours. The search is using information from past steps to try to get information about the curvature of the function, which typically gives it a better direction to go. Another strategy, which usually converges faster, but can be more expensive, is to use the second derivative of the function. This is usually referred to as Newton's method."

To me, the idea seems to be that 'going the wrong way' helps the algorithm learn the 'right way to go': the imperfect steps provide information about the curvature of your function, which guides subsequent steps.

HTH... If not, have a look at the Constrained and Unconstrained Optimization tutorials. Lots of interesting info.
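To see what the tutorial means by using second derivatives, here is a rough Python sketch of a single Newton step on a made-up quadratic (my own example, not the tutorial's). On a quadratic, one Newton step lands exactly on the minimizer, which is why it converges so much faster than steepest descent:

```python
# Sketch of a Newton step on a hypothetical quadratic
# f(x1, x2) = (x1 - 1)^2 + 4*(x2 + 2)^2, minimizer at (1, -2).
# Newton's method scales each gradient component by the inverse of the
# corresponding second derivative: x_new = x - H^-1 * grad.

def grad(x1, x2):
    # Exact gradient of the quadratic above.
    return 2.0 * (x1 - 1.0), 8.0 * (x2 + 2.0)

def newton_step(x1, x2):
    g1, g2 = grad(x1, x2)
    # The Hessian here is diagonal: d2f/dx1^2 = 2, d2f/dx2^2 = 8.
    return x1 - g1 / 2.0, x2 - g2 / 8.0
```

Starting from (10, 10), `newton_step` returns the minimizer (1, -2) in a single step, whereas steepest descent would zig-zag toward it.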

telefunkenvf14

Your gradient is negative. Use

 x1Gradient = (f[x1+h,x2]-f[x1-h,x2])/(2h);
 x2Gradient = (f[x1,x2+h]-f[x1,x2-h])/(2h);
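With that sign convention, you step *against* the gradient. In Python terms, the whole corrected loop might look like this minimal sketch (the quadratic f, the starting point, and the tolerances are placeholders of my own):

```python
# Minimal steepest-descent sketch with positively signed central differences.
# f is a placeholder quadratic with its minimum at (1, -2).

def f(x1, x2):
    return (x1 - 1.0) ** 2 + (x2 + 2.0) ** 2

def gradient(x1, x2, h=1e-7):
    # Central differences: (f(x+h) - f(x-h)) / (2h), standard sign.
    g1 = (f(x1 + h, x2) - f(x1 - h, x2)) / (2 * h)
    g2 = (f(x1, x2 + h) - f(x1, x2 - h)) / (2 * h)
    return g1, g2

def steepest_descent(x1, x2, lam=1.0, tol=1e-6, max_iter=10_000):
    for _ in range(max_iter):
        g1, g2 = gradient(x1, x2)
        if (g1 * g1 + g2 * g2) ** 0.5 <= tol:
            break
        # Step AGAINST the gradient; halve lam if the step does not descend.
        n1, n2 = x1 - lam * g1, x2 - lam * g2
        if f(n1, n2) < f(x1, x2):
            x1, x2 = n1, n2
        else:
            lam /= 2
    return x1, x2
```

Starting from (5, 5) this walks down to (1, -2); flip the subtraction to an addition (keeping the standard gradient sign) and it walks uphill instead.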
whoplisp
  • Alright, but I take care of that because I add the gradient to the previous point. – John Matok Jul 07 '11 at 00:16
  • The wrong sign would still be a problem. You go in the wrong direction. Your method walks uphill. – whoplisp Jul 07 '11 at 00:51
  • It's simply a sign difference. Because I have a negative gradient, I just add it to the old point to get the new point. Say the slope is truly positive. I'd want to move in the negative direction. So I change the sign of the slope and add it to the point to get the next point. – John Matok Jul 07 '11 at 14:38

Steepest descent gets stuck in local optima; add a tabu-search aspect to it so it can escape them.

See this book for example algorithms of steepest ascent (= steepest descent) and tabu search.
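As a toy sketch of the tabu idea (my own illustration, not the book's algorithm; the discretized 1-D function, neighborhood, bounds, and tabu tenure below are all made up): the search always moves to the best non-tabu neighbor, even when that move goes uphill, so it can climb out of a local basin that would trap pure steepest descent.

```python
from collections import deque

# Toy tabu search on a discretized 1-D function with a local minimum
# near x = 1 and the global minimum near x = -1.

def f(i):
    x = i * 0.1  # grid point i corresponds to x = i / 10
    return (x ** 2 - 1.0) ** 2 + 0.3 * x

def tabu_search(start, steps=200, tabu_size=10):
    current, best = start, start
    tabu = deque(maxlen=tabu_size)  # recently visited grid points
    for _ in range(steps):
        candidates = [n for n in (current - 2, current - 1, current + 1, current + 2)
                      if -15 <= n <= 15 and n not in tabu]
        if not candidates:
            break
        current = min(candidates, key=f)  # best neighbor, even if worse
        tabu.append(current)
        if f(current) < f(best):
            best = current
    return best
```

Started at grid point 10 (the basin of the local minimum at x = 1), the search climbs back over the hill and finds the global basin around x = -1; plain steepest descent from the same start would stop at the local minimum.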

Geoffrey De Smet