I'm implementing gradient descent for an assignment and am confused about when the weights are supposed to stop updating. Do I stop updating a weight when it no longer changes very much, i.e. when |weight_i - weight_previous_i| <= (some threshold)?
Also, the way I'm currently implementing this, Weight1 can finish updating before Weight2. Is that right, or should all the weights finish at the same time?
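To make the two options concrete, here is a minimal sketch of both stopping rules. Everything in it is an assumption for illustration rather than the assignment's actual setup: the toy loss, the learning rate `lr`, the threshold `tol`, and the function names `descend_joint` and `descend_per_weight` are all made up.

```python
def gradient(w):
    # Gradient of a hypothetical toy loss f(w) = (w[0] - 3)^2 + 10 * (w[1] + 1)^2.
    return [2.0 * (w[0] - 3.0), 20.0 * (w[1] + 1.0)]


def descend_joint(w, lr=0.04, tol=1e-6, max_iters=10_000):
    """Update every weight each step; stop all of them at once when the
    largest single-weight change is <= tol."""
    w = list(w)
    steps = 0
    for _ in range(max_iters):
        g = gradient(w)
        new_w = [wi - lr * gi for wi, gi in zip(w, g)]
        largest_change = max(abs(n - o) for n, o in zip(new_w, w))
        w = new_w
        steps += 1
        if largest_change <= tol:
            break
    return w, steps


def descend_per_weight(w, lr=0.04, tol=1e-6, max_iters=10_000):
    """Freeze each weight on its own as soon as its change is <= tol
    (the scheme described in the question)."""
    w = list(w)
    frozen = [False] * len(w)
    steps = 0
    for _ in range(max_iters):
        g = gradient(w)  # computed from the current weights, frozen or not
        for i in range(len(w)):
            if frozen[i]:
                continue
            new_wi = w[i] - lr * g[i]
            if abs(new_wi - w[i]) <= tol:
                frozen[i] = True
            w[i] = new_wi
        steps += 1
        if all(frozen):
            break
    return w, steps


w_joint, steps_joint = descend_joint([0.0, 0.0])
w_per, steps_per = descend_per_weight([0.0, 0.0])
print(w_joint, steps_joint)
print(w_per, steps_per)
```

In the first version all weights stop on the same iteration. In the second, a weight whose updates shrink faster gets frozen earlier while the others keep moving. On this toy loss the two weights do not interact, so both versions end up near the same point; whether that still holds when the weights are coupled through the loss is exactly what I'm unsure about.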