
I am implementing batch gradient descent in MATLAB and am stuck on the update step for theta. theta is a vector with two components (two rows). X is a matrix with m rows (the number of training samples) and n = 2 columns (the number of features). y is a vector with m rows.

During the update step, I need to set each theta(i) to

theta(i) = theta(i) - (alpha/m)*sum((X*theta-y).*X(:,i))

This can be done with a for loop, but I can't figure out how to vectorize it (because of the X(:,i) term).

Any suggestions?

Luis Mendo
bigTree

2 Answers


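Looks like you are trying to do a simple matrix multiplication, which is what MATLAB is built for. For each i, sum((X*theta-y).*X(:,i)) is the dot product of column i of X with the residual X*theta-y, so stacking those sums for all i gives X' * (X*theta-y):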

theta = theta - (alpha/m) * (X' * (X*theta-y));
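
One way to check that this one-liner matches the per-component rule from the question is to run both on the same data. The sketch below is only an illustration: the values of X, y and alpha and the names grad, theta_loop and theta_vec are assumptions, not part of the question. The loop version stores the gradient in a temporary vector so that every component is computed from the old theta before anything is overwritten.

```matlab
% Hypothetical toy data: m = 4 samples, n = 2 columns (the first is an intercept column).
X = [ones(4, 1), [1; 2; 3; 4]];
y = [2; 4; 6; 8];
alpha = 0.1;
m = size(X, 1);
theta = [0; 0];

% Loop version: compute every component from the old theta, then update once.
grad = zeros(size(theta));
for i = 1:numel(theta)
    grad(i) = sum((X * theta - y) .* X(:, i));
end
theta_loop = theta - (alpha / m) * grad;

% Vectorized version: X' * residual stacks the same sums into one vector.
theta_vec = theta - (alpha / m) * (X' * (X * theta - y));

% Largest absolute difference between the two updates.
disp(max(abs(theta_loop - theta_vec)));
```

If the two forms agree, the displayed difference should be zero up to floating-point rounding.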
Mad Physicist
  • @MadPhysicist works great thanks. By the way, I figured out this is not the good way of performing gradient descent for it doesn't update all the features simultaneously – bigTree Dec 23 '13 at 01:45
  • Isn't this only correct where `X = [ones(m, 1), data(:,1)]`? Because unlike `theta_1`, `theta_0` shouldn't be multiplied by `x^i` so `X` must contain a column of `1s`? – Quaker Feb 27 '16 at 17:25
  • Can this be further simplified as theta - (alpha/m)*(theta - A' * y) ? – XPD Sep 17 '20 at 04:15
  • @XPD only if `X'*X == I` – Mad Physicist Sep 17 '20 at 05:38
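
The condition in that last comment can be seen by expanding the gradient term. The short derivation below reads the A in the earlier comment as X, which is an assumption about what was meant.

```latex
X^{\top}(X\theta - y) = X^{\top}X\,\theta - X^{\top}y ,
\qquad\text{so}\qquad
X^{\top}(X\theta - y) = \theta - X^{\top}y \ \text{ for all } \theta
\iff X^{\top}X = I .
```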

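In addition to the answer given by Mad Physicist, the following can also be applied. It multiplies every column of X element-wise by the residual, which relies on implicit expansion (MATLAB R2016b or later), and then sums down each column.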

theta = theta - (alpha/m) * sum( (X * theta - y).* X )';
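
As a rough sanity check, the sum form and the matrix-product form can be compared on a small example. This is a sketch under assumptions: the data and the names r, g_matrix and g_sum are made up for illustration. It uses bsxfun so that it also runs on releases without implicit expansion, and passes the dimension to sum explicitly so that a single-row X is not summed across its columns.

```matlab
% Hypothetical toy data, chosen only for illustration.
X = [ones(3, 1), [1; 2; 3]];
y = [1; 2; 4];
theta = [0.5; 0.5];

r = X * theta - y;                         % residual, m-by-1

g_matrix = X' * r;                         % matrix-product form
g_sum = sum(bsxfun(@times, r, X), 1)';     % sum form, summing down each column

% Largest absolute difference between the two gradient vectors.
disp(max(abs(g_matrix - g_sum)));
```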

Rishu