
My question is based on the data from the Coursera course - https://www.coursera.org/learn/machine-learning/ - but after searching around, it appears to be a common problem.

Gradient descent works perfectly on the normalized data (pic. 1), but goes in the wrong direction on the original data (pic. 2), with J (the cost function) growing very fast toward infinity. The difference between the parameter values is about 10^3.

I thought that normalization was only required for better execution speed, and I really can't see a reason for this growth in the cost function, even after a lot of searching. Decreasing 'alpha', e.g. making it 0.001 or 0.0001, doesn't help either.
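By "normalized" I mean the usual mean/std feature scaling, roughly like this (my own sketch, not the assignment code; it assumes the first column of X_basic is the column of ones and the remaining columns are the raw features):

m = length(Y);
mu    = mean(X_basic(:, 2:end));   % per-feature means
sigma = std(X_basic(:, 2:end));    % per-feature standard deviations
X_buf = X_basic;                   % keep the intercept column as-is
X_buf(:, 2:end) = (X_basic(:, 2:end) - repmat(mu, m, 1)) ./ repmat(sigma, m, 1);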

Please post if you have any ideas!

P.S. (I provided the matrices to the functions manually, where X_buf is the normalized version and X_basic is the original; Y is the vector of all examples, Q is the theta vector, and alpha is the learning rate).
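Roughly how the calls look (the alpha and num_iters values here are only placeholders for illustration, not the exact ones from my runs):

theta_init = zeros(size(X_basic, 2), 1);   % this is Q above
alpha      = 0.01;                         % placeholder value
num_iters  = 400;                          % placeholder value
[theta, J_history] = gradientDescentMulti(X_buf,   Y, theta_init, alpha, num_iters);  % J decreases (pic. 1)
[theta, J_history] = gradientDescentMulti(X_basic, Y, theta_init, alpha, num_iters);  % J grows toward infinity (pic. 2)

The functions themselves: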

function [theta, J_history] = gradientDescentMulti(X, Y, theta, alpha, num_iters)

m = length(Y);                   % number of training examples
J_history = zeros(num_iters, 1); % cost recorded after each iteration

for iter = 1:num_iters
    % vectorized update over all parameters at once
    theta = theta - (alpha/m) * X' * (X*theta - Y);
    J_history(iter) = computeCostMulti(X, Y, theta);
end

end

And the second function:

function J = computeCostMulti(X, Y, theta)

m = length(Y); % number of training examples
J = (1/(2*m)) * (X*theta - Y)' * (X*theta - Y);

end
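A small sanity check on the output of the calls above makes the divergence visible numerically rather than only in the plots (again just an illustration, reusing the placeholder variables from the call sketch):

[theta, J_history] = gradientDescentMulti(X_basic, Y, theta_init, alpha, num_iters);
if any(diff(J_history) > 0)
    % the recorded cost went up between two consecutive iterations
    fprintf('J increased for the first time at iteration %d\n', find(diff(J_history) > 0, 1) + 1);
end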

Screenshots: pic. 1 - cost on the normalized data; pic. 2 - cost on the original data

  • Have you tried the course bulletin board for help? The TAs are pretty responsive. I don't see a problem with your code, but it's also been a year since I worked inside the ML functionality. – Prune Jun 06 '16 at 18:44
  • try learning rate 1e-10 (depending on the scale of your unnormalized data) – lejlot Jun 06 '16 at 22:15
  • It is unclear where the pictures you refer to are. – etov Jun 08 '16 at 10:43

0 Answers