I'm trying to optimize a function with gradient descent code that I wrote, and I want to use a separate function file implementing the Strong Wolfe conditions to find a good alpha.
Both pieces of code work for one equation, but with the other equation I get -inf.
The -inf occurs in con1 and con3 inside the Strong Wolfe condition function.
I have no idea why I can't find an alpha with the second equation; please help me with this.
My code is written in MATLAB.
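For reference, the Strong Wolfe conditions I'm trying to check (con1 <= con2 and con3 <= con4 in the line-search function further below) should, as far as I understand, be:

f(x + alpha*p) <= f(x) + c1 * alpha * grad_f(x)' * p      (sufficient decrease, con1 <= con2)
|grad_f(x + alpha*p)' * p| <= c2 * |grad_f(x)' * p|       (curvature, con3 <= con4)

with 0 < c1 < c2 < 1 and p the descent direction (here p = -grad_f(x)).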
This is my gradient descent code:
% Clear the workspace
clear all;
clc;
% Define the equation
syms x1 x2;
% f(x1, x2) = (-x1 ^ 3) * exp(x2 - (x1 ^ 2) - (10 * ((x1 - x2) ^ 2))); % First equation
f(x1, x2) = -cos(x1) * (cos(x2) * (exp((x1-pi)^2+(x2-pi)^2))); % Second equation
% Plot the contour plot
fcontour(f, 'Fill', 'on');
% To show the color bar next to the plot
colorbar
% For 3D plot (commented out because it doesn't look good)
% fsurf(f);
% Hold the plot so we can show the updates of the point on it
hold on
% The gradient of the equation
gx = gradient(f);
% Initial point given by the homework
% x = [0.75, -1.25];
x = [2.1, 2.1];
% Learning rate (alpha)
% learningRate = LR(x, f);
learningRate = LR(x, f);
% First evaluation of the gradient at the initial point
xResult = double(subs(gx(x(1, 1), x(1, 2)),{x1,x2},{x(1,1),x(1,2)}));
% Two lists to save the points during the updates
x1List = [x(1, 1)];
x2List = [x(1, 2)];
% Loop to update the points
for i = 1:1000
    % Update the points with the GD formula
    x(1, 1) = x(1,1) - (learningRate * xResult(1, 1));
    x(1, 2) = x(1,2) - (learningRate * xResult(2, 1));
    % Save the points in the lists for further use
    x1List = [x1List x(1, 1)]; %#ok
    x2List = [x2List x(1, 2)]; %#ok
    % With the new points compute the gradient
    xResult = double(subs(gx(x(1, 1), x(1, 2)),{x1,x2},{x(1,1),x(1,2)}));
    % Plot the point on the contour
    % plot(x(1, 1), x(1, 2), 'r-*');
    % title('Iteration = ', i )
    % pause(0.1)
    % Stop the iteration
    % Check if the points don't change by more than 10e-4 and if so break
    errorAmountx1 = abs(x1List(1, i) - x(1, 1));
    errorAmountx2 = abs(x2List(1, i) - x(1, 2));
    if (errorAmountx1 < 10e-4 && errorAmountx2 < 10e-4)
        break
    end
    % if (mod(i, 5) == 0)
    %     learningRate = LR_2(x, f);
    % end
end
% After finding the minimum point, print out the results
disp('Iteration = ');
disp(i);
disp('The Min Point = ');
disp(x);
disp('Final x in equation : ')
fResult = double(subs(f(x(1,1), x(1,2)),{x1,x2},{x(1,1),x(1,2)}));
disp(fResult);
plot(x1List, x2List, 'r-*');
This is my Strong Wolfe Condition code:
function [learningRate] = LR(x, f)
    syms x1 x2;
    % Rebuild f as a symbolic function of x1 and x2
    f(x1, x2) = f;
    % Candidate step sizes (alpha) to try
    lr = 0.00001:0.001:1;
    % Wolfe constants, chosen so that 0 < c1 < c2 < 1
    c1 = rand();
    c2 = c1 + rand() * (1 - c1);
    % Gradient at the current point and the (steepest) descent direction p
    gradi = double(subs(gradient(f, [x1, x2]),{x1,x2},{x(1,1),x(1,2)}));
    p = -gradi;
    for i = 1:length(lr)
        disp(i);
        % Trial point for the current step size
        xap = x + lr(i) * transpose(p);
        % con1 <= con2 : sufficient decrease (Armijo) condition
        con1 = double(subs(f(xap(1,1), xap(1,2)),{x1,x2},{xap(1,1), xap(1,2)}));
        con2 = double(subs(f(x(1,1), x(1,2)),{x1,x2},{x(1,1),x(1,2)})) + (c1 * (lr(i) * transpose(gradi)) * p);
        % con3 <= con4 : strong curvature condition
        con3 = abs(transpose(double(subs(gradient(f, [x1, x2]),{x1,x2},{xap(1,1),xap(1,2)}))) * p);
        con4 = c2 * abs(transpose(gradi) * p);
        % Accept the first step size that satisfies both conditions
        if (con1 <= con2 && con3 <= con4)
            learningRate = lr(i);
            break
        end
    end
end
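In case it helps with debugging, here is a minimal diagnostic sketch (not part of my code above; the isfinite check, fprintf and continue are my own addition) that could be dropped into the for loop right after con4 is computed, to report and skip step sizes where the evaluated function or directional derivative is no longer finite:

        % Hypothetical diagnostic: report and skip step sizes where the
        % evaluated function or directional derivative is non-finite
        if ~isfinite(con1) || ~isfinite(con3)
            fprintf('Non-finite value at lr = %g: con1 = %g, con3 = %g\n', lr(i), con1, con3);
            continue
        end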