I have a question concerning Geoffrey Hinton's proof of convergence of the perceptron algorithm: Lecture Slides.
On slide 23 it says:
Every time the perceptron makes a mistake, the squared distance to all of these generously feasible weight vectors is always decreased by at least the squared length of the update vector.
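For reference, here is the algebra I assume the slide is appealing to (my own reconstruction, with the class label folded into the input vector $\mathbf{x}$, so that a mistake means $\mathbf{w} \cdot \mathbf{x} \le 0$ and the update is $\mathbf{w} \leftarrow \mathbf{w} + \mathbf{x}$):

$$
\|\mathbf{w}^{*} - (\mathbf{w} + \mathbf{x})\|^{2}
= \|\mathbf{w}^{*} - \mathbf{w}\|^{2}
- 2\,\mathbf{x} \cdot (\mathbf{w}^{*} - \mathbf{w})
+ \|\mathbf{x}\|^{2},
$$

so with $\mathbf{w} \cdot \mathbf{x} \le 0$ on a mistake and $\mathbf{w}^{*} \cdot \mathbf{x} \ge \|\mathbf{x}\|^{2}$ for a generously feasible vector, the squared distance to $\mathbf{w}^{*}$ should decrease by at least $\|\mathbf{x}\|^{2}$.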
My problem is that I can make the distance reduction arbitrarily small by moving the generously feasible weight vector further to the right. See the following depiction:
So how can the distance be guaranteed to shrink by at least the squared length of the update vector (in blue), if I can make the reduction arbitrarily small?
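In case it helps to reproduce what I mean, here is a small numeric sketch of my scenario (the vectors and the candidate positions are just numbers I made up, not taken from the slides; I read "generously feasible" as $\mathbf{w}^{*} \cdot \mathbf{x} \ge \|\mathbf{x}\|^{2}$ for the single training case):

```python
import numpy as np

# Toy setup (my own numbers, not from the slides): the class label is folded
# into the input, so a mistake means w . x <= 0 and the update is w <- w + x.
x = np.array([0.0, 1.0])        # update vector (the blue arrow in my picture)
w = np.array([0.0, -1.0])       # current weights: w . x = -1 <= 0, a mistake
w_new = w + x                   # weights after the perceptron update

# Candidate weight vectors moved further and further "to the right".
# Each satisfies w_star . x = 3 >= ||x||^2 = 1, my reading of "generously feasible".
for shift in [0.0, 1.0, 10.0, 100.0, 1000.0]:
    w_star = np.array([shift, 3.0])
    reduction = np.linalg.norm(w_star - w) - np.linalg.norm(w_star - w_new)
    print(f"shift={shift:8.1f}  distance reduction={reduction:.4f}  "
          f"squared update length={x @ x:.1f}")
```

The printed reduction keeps shrinking as the candidate moves to the right, even though the squared length of the update stays fixed, which is exactly what I find confusing.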