I'm trying to optimize a function with gradient descent code that I wrote, and I want to use a separate function file implementing the Strong Wolfe conditions to find a good alpha.
Both pieces of code work for one equation, but with the other equation I get -inf.
The -inf occurs in con1 and con3 inside the Strong Wolfe condition function.
I have no idea why I can't find an alpha with the second equation; please help me with this.
My code is written in MATLAB.
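For reference, the Strong Wolfe conditions I'm trying to check (con1 <= con2 and con3 <= con4 in the line-search function further below) should, as far as I understand, be:

f(x + alpha*p) <= f(x) + c1 * alpha * grad_f(x)' * p      (sufficient decrease, con1 <= con2)
|grad_f(x + alpha*p)' * p| <= c2 * |grad_f(x)' * p|       (curvature, con3 <= con4)

with 0 < c1 < c2 < 1 and p the descent direction (here p = -grad_f(x)).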
This is my gradient descent code:
% Clear the workspace
clear all;
clc;
% Define the equation
syms x1 x2;
% f(x1, x2) = (-x1 ^ 3) * exp(x2 - (x1 ^ 2) - (10 * ((x1 - x2) ^ 2))); % First equation
f(x1, x2) = -cos(x1) * (cos(x2) * (exp((x1-pi)^2+(x2-pi)^2))); % Second equation
% Plot the contour plot
fcontour(f, 'Fill', 'on');
% To show the color bar next to the plot
colorbar
% For 3D plot (commented out because it doesn't look good)
% fsurf(f);
% Hold the plot so we can show the updates of the point on it
hold on
% The gradient of the equation
gx = gradient(f);
% Initial point given by the homework
% x = [0.75, -1.25];
x = [2.1, 2.1];
% Learning rate (alpha)
% learningRate = LR(x, f);
learningRate = LR(x, f);
% First evaluation of the gradient at the initial point
xResult = double(subs(gx(x(1, 1), x(1, 2)),{x1,x2},{x(1,1),x(1,2)}));
% Two lists to save the points during the updates
x1List = [x(1, 1)];
x2List = [x(1, 2)];
% Loop to update the points
for i = 1:1000
    % Update the points with the GD formula
    x(1, 1) = x(1,1) - (learningRate * xResult(1, 1));
    x(1, 2) = x(1,2) - (learningRate * xResult(2, 1));
    % Save the points in the lists for further use
    x1List = [x1List x(1, 1)]; %#ok
    x2List = [x2List x(1, 2)]; %#ok
    % With the new points compute the gradient
    xResult = double(subs(gx(x(1, 1), x(1, 2)),{x1,x2},{x(1,1),x(1,2)}));
    % Plot the point on the contour
    % plot(x(1, 1), x(1, 2), 'r-*');
    % title('Iteration = ', i )
    % pause(0.1)
    % Stop the iteration
    % Check if the points don't change by more than 10e-4 and if so break
    errorAmountx1 = abs(x1List(1, i) - x(1, 1));
    errorAmountx2 = abs(x2List(1, i) - x(1, 2));
    if (errorAmountx1 < 10e-4 && errorAmountx2 < 10e-4)
        break
    end
    % if (mod(i, 5) == 0)
    %     learningRate = LR_2(x, f);
    % end
end
% After finding the minimum point, print out the results
disp('Iteration = ');
disp(i);
disp('The Min Point = ');
disp(x);
disp('Final x in equation : ')
fResult = double(subs(f(x(1,1), x(1,2)),{x1,x2},{x(1,1),x(1,2)}));
disp(fResult);
plot(x1List, x2List, 'r-*');
This is my Strong Wolfe Condition code:
function [learningRate] = LR(x, f)
    syms x1 x2;
    % Rebuild f as a symbolic function of x1 and x2
    f(x1, x2) = f;
    % Candidate step sizes (alpha) to try
    lr = 0.00001:0.001:1;
    % Wolfe constants, chosen so that 0 < c1 < c2 < 1
    c1 = rand();
    c2 = c1 + rand() * (1 - c1);
    % Gradient at the current point and the (steepest) descent direction p
    gradi = double(subs(gradient(f, [x1, x2]),{x1,x2},{x(1,1),x(1,2)}));
    p = -gradi;
    for i = 1:length(lr)
        disp(i);
        % Trial point for the current step size
        xap = x + lr(i) * transpose(p);
        % con1 <= con2 : sufficient decrease (Armijo) condition
        con1 = double(subs(f(xap(1,1), xap(1,2)),{x1,x2},{xap(1,1), xap(1,2)}));
        con2 = double(subs(f(x(1,1), x(1,2)),{x1,x2},{x(1,1),x(1,2)})) + (c1 * (lr(i) * transpose(gradi)) * p);
        % con3 <= con4 : strong curvature condition
        con3 = abs(transpose(double(subs(gradient(f, [x1, x2]),{x1,x2},{xap(1,1),xap(1,2)}))) * p);
        con4 = c2 * abs(transpose(gradi) * p);
        % Accept the first step size that satisfies both conditions
        if (con1 <= con2 && con3 <= con4)
            learningRate = lr(i);
            break
        end
    end
end
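In case it helps with debugging, here is a minimal diagnostic sketch (not part of my code above; the isfinite check, fprintf and continue are my own addition) that could be dropped into the for loop right after con4 is computed, to report and skip step sizes where the evaluated function or directional derivative is no longer finite:

        % Hypothetical diagnostic: report and skip step sizes where the
        % evaluated function or directional derivative is non-finite
        if ~isfinite(con1) || ~isfinite(con3)
            fprintf('Non-finite value at lr = %g: con1 = %g, con3 = %g\n', lr(i), con1, con3);
            continue
        end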