Incorrect Results from Gradient Descent in Matlab

Question

I'm taking a course that uses Matlab, and I have implemented gradient descent, but it gives incorrect results.

The code:

for iter = 1:num_iters

sumTheta1 = 0;
sumTheta2 = 0;
for s = 1:m
    sumTheta1 = theta(1) + theta(2) .* X(s,2) - y(s);
    sumTheta2 = theta(1) + theta(2) .* X(s,2) - y(s) .* X(s,2);
end

theta(1) = theta(1) - alpha .* (1/m) .* sumTheta1;
theta(2) = theta(2) - alpha .* (1/m) .* sumTheta2;

J_history(iter) = computeCost(X, y, theta);

end

This is the important part. I think the implementation of the formula is correct, even though it's not optimized. The formula is:

theta1 = theta1 - (alpha)(1/m)(summation_i^m(theta1 + theta2*x(i)-y(i)))
theta2 = theta2 - (alpha)(1/m)(summation_i^m((theta1 + theta2*x(i)-y(i))(x(i))))

So where could the problem be?

EDIT: CODE updated

function [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters)

m = length(y); % number of training examples
J_history = zeros(num_iters, 1);


for iter = 1:num_iters

for s = 1:m

sumTheta1 = ((theta(1) .* X(s,1)) + (theta(2) .* X(s,2))) - (y(s));
sumTheta2 = ((theta(1) .* X(s,1)) + (theta(2) .* X(s,2))) - (y(s)) .* X(s,2);
end

temp1 = theta(1) - alpha .* (1/m) .* sumTheta1;
temp2 = theta(2) - alpha .* (1/m) .* sumTheta2;

theta(1) = temp1;
theta(2) = temp2;

J_history(iter) = computeCost(X, y, theta);

end

end

EDIT(2): Fixed it, working code.

Got it. It was Dan's hint that did it. I will accept his answer, but I'm still posting the working code here for anyone who gets stuck. Cheers.

function [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters)

m = length(y); % number of training examples
J_history = zeros(num_iters, 1);

for iter = 1:num_iters

    sumTheta1 = 0;
    sumTheta2 = 0;

    for s = 1:m
        sumTheta1 = sumTheta1 + ((theta(1) .* X(s,1)) + (theta(2) .* X(s,2))) - (y(s));
        sumTheta2 = sumTheta2 + (((theta(1) .* X(s,1)) + (theta(2) .* X(s,2))) - (y(s))) .* X(s,2);
    end

    temp1 = theta(1) - alpha .* (1/m) .* sumTheta1;
    temp2 = theta(2) - alpha .* (1/m) .* sumTheta2;

    theta(1) = temp1;
    theta(2) = temp2;

    % Save the cost J in every iteration
    J_history(iter) = computeCost(X, y, theta);

end

end
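
The function calls computeCost, which is never shown in the question. A minimal sketch, assuming the usual squared-error cost J = 1/(2m) * sum((X*theta - y).^2) and an X whose first column is all ones (both are assumptions, not taken from the post), could be an anonymous function like this. The data is made up for illustration.

```matlab
% Hypothetical stand-in for computeCost (not shown in the original post).
% Assumes X is m-by-2 with a first column of ones and theta is 2-by-1.
computeCost = @(X, y, theta) sum((X * theta - y) .^ 2) / (2 * length(y));

% Tiny made-up data set lying exactly on y = 1 + 2*x.
X = [1 0; 1 1; 1 2];
y = [1; 3; 5];

J_exact = computeCost(X, y, [1; 2]);   % perfect fit, so the cost is 0
J_zero  = computeCost(X, y, [0; 0]);   % (1 + 9 + 25) / (2*3) = 35/6
```

If the course's own computeCost uses a different scaling, J_history will differ by a constant factor, but the thetas found by the descent loop will not.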

You know that Matlab will not understand the syntax `(alpha)(1/m)` right? You need to explicitly put in a multiplication sign, eg `(alpha)*(1/m)*(summation_i^m * (theta1 + theta2 * x(i) - y(i))) * x(i)`. Does this make sense? — Colin T Bowers, Oct 16 '13 at 03:02
Yes, that is the formula as written in the notes, not the Matlab implementation; the Matlab code is above it. I wrote it that way because I don't know how to write formulas here. — Pedro.Alonso, Oct 16 '13 at 03:08
Not sure if there is a generally agreed upon method for writing equations on SO. Personally, I prefer *not* to use code-highlighting on equations, but some users do. If you wanted to go all-out, you can use the Google API's - see [here](http://meta.stackexchange.com/questions/76902/how-can-i-write-math-formula-in-a-stack-overflow-question) — Colin T Bowers, Oct 17 '13 at 03:18

score 1 · Accepted Answer · answered Oct 16 '13 at 06:25

At first glance I notice that your sumTheta1 is not actually summing but rather replacing itself each iteration. I think you meant:

sumTheta1 = sumTheta1 + theta(1) + theta(2) .* X(s,2) - y(s);

And the same for sumTheta2, with the whole error term in parentheses before it is multiplied by X(s,2).

But for future reference you could replace this (corrected) loop:

for s = 1:m
    sumTheta1 = sumTheta1 + theta(1) + theta(2) .* X(s,2) - y(s);
    sumTheta2 = sumTheta2 + (theta(1) + theta(2) .* X(s,2) - y(s)) .* X(s,2);
end

with this vectorized formula:

sumTheta1 = sum(theta(1) + theta(2)*X(:,2) - y);
sumTheta2 = sum((theta(1) + theta(2)*X(:,2) - y) .* X(:,2));
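
A quick way to gain confidence in a vectorized rewrite is to compute the two sums both ways on a small example and compare. The sketch below does that; the data and the starting theta are made up for illustration, and the variable names residual, residuals, vecTheta1 and vecTheta2 are not from the original post.

```matlab
% Made-up data: X has a leading column of ones, as in the question.
X = [1 1; 1 2; 1 3; 1 4];
y = [2; 4; 5; 8];
theta = [0.5; 1.5];
m = length(y);

% Loop version: accumulate, and form the residual before scaling by x.
sumTheta1 = 0;
sumTheta2 = 0;
for s = 1:m
    residual = theta(1) + theta(2) * X(s,2) - y(s);
    sumTheta1 = sumTheta1 + residual;
    sumTheta2 = sumTheta2 + residual * X(s,2);
end

% Vectorized version of the same two sums.
residuals = theta(1) + theta(2) * X(:,2) - y;
vecTheta1 = sum(residuals);
vecTheta2 = sum(residuals .* X(:,2));
```

Both versions should agree up to rounding. If the parentheses around the residual are dropped, only y gets multiplied by X(:,2) and the second sum comes out different.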

Well, I made the change and it got worse. The thetas I got with the first code were -3.120881 and 1.112813, very close to the minimum; with sumTheta(x) added inside the loop, it blows up to -73.069510 and 16.165062. — Pedro.Alonso, Oct 16 '13 at 13:58

Dennis Jaheruddin · Answer 2 · score 1 · 2013-10-16T14:12:53.143

If I see this formula

theta1 = theta1 - (alpha)(1/m)(summation_i^m(theta1 + theta2*x(i)-y(i)))

I guess the matlab equivalent would be:

theta1 = theta1 - alpha/m*sum(theta1 + theta2*x - y)

Probably you can determine m as follows:

m = length(x);

However, your two formulas make me wonder whether you want to calculate them sequentially or simultaneously.

In the second case create a temporary variable and use this in the calculation.

residual = theta1_previous + theta2_previous*x - y;

theta1 = theta1_previous - alpha/m*sum(residual)
theta2 = theta2_previous - alpha/m*sum(residual.*x)
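
To see what the sequential versus simultaneous distinction changes in practice, here is a sketch of a single step computed both ways. The data, alpha, and the `_sim` / `_seq` suffixes are made up for illustration and are not from the question.

```matlab
% Made-up data and settings, for illustration only.
x = [1; 2; 3];
y = [2; 4; 6];
alpha = 0.1;
m = length(x);
theta1_previous = 0;
theta2_previous = 0;

% Simultaneous: both updates use the residual of the previous thetas.
residual = theta1_previous + theta2_previous * x - y;
theta1_sim = theta1_previous - alpha/m * sum(residual);
theta2_sim = theta2_previous - alpha/m * sum(residual .* x);

% Sequential: theta2 is computed after theta1 has already been updated.
theta1_seq = theta1_previous - alpha/m * sum(theta1_previous + theta2_previous * x - y);
theta2_seq = theta2_previous - alpha/m * sum((theta1_seq + theta2_previous * x - y) .* x);
```

The theta1 values match, but theta2 does not, because the sequential version evaluates the error with the already-updated theta1.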

edited Oct 16 '13 at 14:12

answered Oct 16 '13 at 09:39

Dennis Jaheruddin


Gradient descent should be simultaneous – Dan Oct 16 '13 at 13:56
Is simultaneous, I think in this line: theta(1) = theta(1) - alpha .* (1/m) .* sumTheta1; theta(2) = theta(2) - alpha .* (1/m) .* sumTheta2; – Pedro.Alonso Oct 16 '13 at 14:00
@Pedro.Alonso Of course there are multiple ways to do it. I have updated my answer with one that should do the trick. – Dennis Jaheruddin Oct 16 '13 at 14:15

score 1 · Answer 3 · answered Nov 12 '13 at 20:16


Vectorized version:

for iter = 1:num_iters
    theta = theta - (alpha .* X'*(X * theta - y) ./m);
    J_history(iter) = computeCost(X, y, theta);
end
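
For anyone who wants to try the vectorized loop end to end, here is a self-contained sketch on made-up data. The data, alpha, num_iters, and the anonymous computeCost are all assumptions for illustration (computeCost is not shown anywhere in the thread); only the update line is taken from the loop above, written with explicit parentheses.

```matlab
% Made-up data lying exactly on y = 1 + 2*x; first column of X is ones.
X = [ones(5,1), (1:5)'];
y = 1 + 2 * (1:5)';
m = length(y);

% Hypothetical stand-in for computeCost (squared-error cost, assumed).
computeCost = @(X, y, theta) sum((X * theta - y) .^ 2) / (2 * m);

theta = zeros(2,1);
alpha = 0.05;          % assumed step size, small enough for this data
num_iters = 5000;
J_history = zeros(num_iters, 1);

for iter = 1:num_iters
    % X' * (X*theta - y) stacks both partial-derivative sums in one vector,
    % so theta(1) and theta(2) are updated simultaneously.
    theta = theta - alpha * (X' * (X * theta - y)) / m;
    J_history(iter) = computeCost(X, y, theta);
end
```

On this data theta should end up very close to [1; 2]. With a much larger alpha the iteration diverges, so a steadily growing J_history is a sign that the step size is too big.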


Franck Dernoncourt


I have another problem [chat](http://chat.stackoverflow.com/rooms/41050/logistic-regression) – Pedro.Alonso Nov 12 '13 at 21:26
