
I tried the normal equation and the result was correct. However, when I used gradient descent, the resulting fit was wrong. I looked through online resources but could not find the mistake. I don't think there is anything special in the following code.

clear;
clc;
m = 100; % generate 100 points
noise = randn(m,1); % 100 samples of standard normal noise
x = rand(m, 1) * 10; % generate 100 x's ranging from 0 to 10
y = 10 + 2 * x + noise; % true model: intercept 10, slope 2
plot (x, y, '.');
hold on;


X = [ones(m, 1) x];
theta = [0; 0];
plot (x, X * theta, 'y');
hold on;

% Method 1 gradient descent
alpha = 0.02; % too large an alpha makes the iterates diverge
num_iters = 5;
[theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters)

% Method 2 normal equation
% theta = (pinv(X' * X )) * X' * y

plot (x, X * theta, 'r');



function [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters)
    m = length(y); 
    J_history = zeros(num_iters, 1);
    for iter = 1:num_iters
        theta = theta - alpha * (1/m) * (X' * (X * theta - y));

        % plot (X(:, 2), X * theta, 'g');
        % hold on;

        J_history(iter) = costFunction(X, y, theta);
    end
end

function J = costFunction( X, y, theta )
    m = length(y);  
    predictions = X * theta; % predictions on all m examples
    sqrErrors = (predictions - y).^2; % Squared errors
    J = 1/(2*m) * sum(sqrErrors); 
end
  • The code seems clean. Check the machine learning Coursera implementations of gradient descent on GitHub; they might help you sort this issue out. – 16per9 Dec 15 '16 at 10:46

1 Answer


Your code is correct. The problem is the small number of iterations. Take num_iters = 5000; and you will see that theta converges to the right value ([10; 2]).
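
As a quick check (a minimal sketch, assuming the gradientDescent and costFunction defined in your script, and the same X, y, and alpha), rerun with more iterations and inspect J_history; the cost should decrease and flatten out, confirming that the algorithm is fine and was simply stopped too early:

theta = [0; 0];              % restart from the same initial guess
alpha = 0.02;
num_iters = 5000;            % many more iterations than before
[theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters);
theta                        % should be close to [10; 2]

figure;
plot(1:num_iters, J_history); % cost curve: decreasing, then flat near convergence
xlabel('iteration');
ylabel('cost J');

With alpha = 0.02, five iterations barely move theta away from [0; 0], which is why the plotted line looked wrong even though each update step was computed correctly.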