
I have to implement the steepest descent method and test it on functions of two variables, using MATLAB. Here's what I have so far:

x_0 = [0;1.5]; %Initial guess
alpha = 1.5; %Step size
iteration_max = 10000;
tolerance = 10e-10;

% Two anonymous functions computing the 1st and 2nd components of the gradient
f = @(x,y) (cos(y) * exp(-(x-pi)^2 - (y-pi)^2) * (sin(x) - 2*cos(x)*(pi-x)));
g = @(x,y) (cos(x) * exp(-(x-pi)^2 - (y-pi)^2) * (sin(y) - 2*cos(y)*(pi-y)));

%Initialization
iter = 0;
grad = [1; 1]; %Gradient

while (norm(grad,2) >= tolerance)
    grad(1,1) = f(x_0(1), x_0(2));
    grad(2,1) = g(x_0(1), x_0(2));
    x_new = x_0 - alpha * grad; %New solution
    x_0 = x_new; %Update old solution
    iter = iter + 1;

    if iter > iteration_max
        break
    end
end

The problem is that I do not obtain the same values as, for example, WolframAlpha. For this particular function I should obtain either (3.14, 3.14) or (1.3, 1.3), but I obtain (0.03, 1.4).

1 Answer


You should know that this method performs a local search, so it can get stuck in a local minimum depending on the initial guess and the step size.

  • With a different initial guess, it may converge to a different local minimum.

  • The step size is important because a large step size can prevent the algorithm from converging, while a small one makes the algorithm very slow. This is why you should adapt the step size as the function value decreases; one common way to do that is sketched just after this list.
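A minimal sketch of such an adaptive scheme is a backtracking (Armijo) line search. Testing for sufficient decrease needs the objective itself, not just its gradient; the `obj` handle below is reconstructed from the question's partial derivatives, so its exact form is an assumption:

% Backtracking line search sketch. obj is reconstructed from the
% question's gradient formulas f and g; its exact form is an assumption.
obj = @(x,y) -cos(x).*cos(y).*exp(-(x-pi).^2 - (y-pi).^2);
f = @(x,y) cos(y).*exp(-(x-pi).^2 - (y-pi).^2).*(sin(x) - 2*cos(x).*(pi-x));
g = @(x,y) cos(x).*exp(-(x-pi).^2 - (y-pi).^2).*(sin(y) - 2*cos(y).*(pi-y));

x = [2; 2]; % initial guess
for iter = 1:10000
    grad = [f(x(1), x(2)); g(x(1), x(2))];
    if norm(grad) < 1e-10, break; end
    alpha = 1; % start from a full step each iteration
    % halve alpha until the sufficient-decrease (Armijo) condition holds
    while obj(x(1) - alpha*grad(1), x(2) - alpha*grad(2)) > ...
            obj(x(1), x(2)) - 1e-4 * alpha * (grad' * grad)
        alpha = alpha / 2;
    end
    x = x - alpha * grad;
end

This way the algorithm takes large steps while they still decrease the objective, and automatically shrinks them near the minimum.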

It is always a good idea to understand the function you want to optimize by plotting it (if possible). The function you are working with looks like this (over the range [-pi, pi]):

[Figure: surface plot of the function over [-pi, pi] x [-pi, pi]]
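If you want to reproduce such a plot yourself, here is a short sketch, again assuming the objective reconstructed from the question's gradients:

% Surface plot of the objective (assumed form, reconstructed from f and g)
obj = @(x,y) -cos(x).*cos(y).*exp(-(x-pi).^2 - (y-pi).^2);
[X, Y] = meshgrid(linspace(-pi, pi, 200));
surf(X, Y, obj(X, Y), 'EdgeColor', 'none');
xlabel('x'); ylabel('y'); zlabel('f(x,y)');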

With the following parameter values you will get to the local minimum you are looking for:

x_0 = [2;2]; %Initial guess
alpha = 0.5; %Step size
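For completeness, here is your loop with only these two parameters changed (gradient handles copied from the question); the iterate should now approach (3.14, 3.14):

f = @(x,y) cos(y)*exp(-(x-pi)^2 - (y-pi)^2)*(sin(x) - 2*cos(x)*(pi-x));
g = @(x,y) cos(x)*exp(-(x-pi)^2 - (y-pi)^2)*(sin(y) - 2*cos(y)*(pi-y));

x_0 = [2; 2];           %Initial guess
alpha = 0.5;            %Step size
tolerance = 1e-10;
iteration_max = 10000;

iter = 0;
grad = [1; 1];
while norm(grad, 2) >= tolerance && iter <= iteration_max
    grad = [f(x_0(1), x_0(2)); g(x_0(1), x_0(2))];
    x_0 = x_0 - alpha * grad;  %Gradient step
    iter = iter + 1;
end
x_0  %Expected to be close to (3.14, 3.14)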
  • Thanks a lot. Do you know how I could show, visually with a figure, the convergence (or divergence) of the method? I was thinking of plotting the iteration vs. norm of the gradient. – wrong_path Mar 03 '17 at 06:30
  • You need to have the functions that the gradients are calculated based on. Consider they are `F` and `G`; then at each point `x` you can compute `J = 0.5*(F^2+G^2)`. Plotting `J` over `iter` shows you the convergence of the algorithm. – NKN Mar 03 '17 at 06:38
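A sketch of that diagnostic, storing `J` at every iteration and plotting it on a log scale (parameter values taken from the answer above):

% Record J = 0.5*(F^2+G^2) each iteration and plot it against iter
f = @(x,y) cos(y)*exp(-(x-pi)^2 - (y-pi)^2)*(sin(x) - 2*cos(x)*(pi-x));
g = @(x,y) cos(x)*exp(-(x-pi)^2 - (y-pi)^2)*(sin(y) - 2*cos(y)*(pi-y));

x_0 = [2; 2];
alpha = 0.5;
history = [];
for iter = 1:10000
    F = f(x_0(1), x_0(2));
    G = g(x_0(1), x_0(2));
    J = 0.5*(F^2 + G^2);               % convergence measure from the comment
    history(end+1) = J;
    if sqrt(2*J) < 1e-10, break; end   % equivalent to norm(grad) < tol
    x_0 = x_0 - alpha * [F; G];
end
semilogy(1:numel(history), history);
xlabel('iteration'); ylabel('J = 0.5(F^2 + G^2)');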