I am using the vectorized version of the gradient descent update as described at "gradient descent seems to fail":
theta = theta - (alpha/m * (X * theta-y)' * X)';
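As a shape check on the update above, here is a minimal sketch of a single step in Octave. It assumes X is an m-by-n design matrix, y is m-by-1 and theta is n-by-1, and it uses the matrix product `*`. The toy data and the value `alpha = 0.1` are made up for illustration and are not taken from this question.

```octave
% Hypothetical toy data (not from the question): m = 3 examples, n = 2 features.
X = [1 1; 1 2; 1 3];   % m-by-n design matrix, first column is the intercept term
y = [1; 2; 3];         % m-by-1 targets
theta = [0; 0];        % n-by-1 parameters
alpha = 0.1;           % assumed learning rate
m = size(X, 1);        % number of training examples

% One vectorized step: (X*theta - y)' is 1-by-m, multiplying by X gives 1-by-n,
% and the outer transpose turns that into an n-by-1 column matching theta.
theta = theta - (alpha/m * (X * theta - y)' * X)';
disp(theta)
```

With these numbers theta moves from [0; 0] to roughly [0.2; 0.4667] after one step, so under these shape assumptions the expression does change theta.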
The theta values are not being updated: whatever initial value theta has is the value it still has after running gradient descent:
Example 1:
m = 1
X = [1]
y = [0]
theta = 2
theta = theta - (alpha/m .* (X .* theta-y)' * X)'
theta =
2.0000
Example 2:
m = 1
X = [1;1;1]
y = [1;0;1]
theta = [1;2;3]
theta = theta - (alpha/m .* (X .* theta-y)' * X)'
theta =
1.0000
2.0000
3.0000
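One way to sanity-check the two examples is to rerun them with an explicit learning rate. The value of alpha is not shown above, so `alpha = 0.1` below is an assumption, and the names `theta1` and `theta2` are introduced only for this sketch. The update lines are copied as written in the examples, including the element-wise `.*`.

```octave
alpha = 0.1;   % assumed; the examples above do not show the value of alpha
m = 1;

% Example 1 as written above (element-wise .*)
X = [1]; y = [0]; theta = 2;
theta1 = theta - (alpha/m .* (X .* theta - y)' * X)';

% Example 2 as written above (element-wise .*, X and theta are both 3-by-1)
X = [1;1;1]; y = [1;0;1]; theta = [1;2;3];
theta2 = theta - (alpha/m .* (X .* theta - y)' * X)';

disp(theta1)
disp(theta2)
```

With this assumed alpha, theta does change in both cases (to 1.8, and to [0.6; 1.6; 2.6]), so the actual value of alpha may be worth checking. In example 2 the same scalar is subtracted from every component, because X and theta are both 3-by-1 and are combined element-wise, not as a matrix product.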
Is theta = theta - (alpha/m * (X * theta-y)' * X)';
a correct vectorised implementation of gradient descent?