Looking at the Bellman-Ford algorithm, in step 2 you consider using every edge (u, v) to find a shorter path to v and, if you see an improvement, you record it by setting predecessor[v] = u. This means that at each stage you know the predecessor of each node, so you can eliminate length-2 cycles by checking that predecessor[u] != v before you set predecessor[v] = u.
By eliminating these cycles you change the invariant of the induction: at each stage you are now finding the shortest route from s to u with at most i edges that does not include any length-2 cycles.
A negative cycle of length 3 or greater that is reachable from the source should still show up. The check for negative cycles looks for further apparent improvements after enough rounds have run that every shortest path, up to the length needed to visit every vertex, should already have been found.
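The idea above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a proven general algorithm: the function name `bellman_ford_no_backtrack`, the `(u, v, weight)` edge format, and the choice to treat each undirected edge as two directed arcs are all hypothetical choices made for the example.

```python
import math

def bellman_ford_no_backtrack(vertices, undirected_edges, source):
    """Bellman-Ford variant that refuses to relax (u, v) when predecessor[u] == v.

    Returns (dist, predecessor, negative_cycle_found).
    """
    # Assumption: every undirected edge can be walked in both directions.
    arcs = []
    for u, v, w in undirected_edges:
        arcs.append((u, v, w))
        arcs.append((v, u, w))

    dist = {x: math.inf for x in vertices}
    predecessor = {x: None for x in vertices}
    dist[source] = 0

    def relax_all():
        changed = False
        for u, v, w in arcs:
            if predecessor[u] == v:
                continue  # would walk v -> u -> v: a length-2 cycle
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                predecessor[v] = u
                changed = True
        return changed

    # Usual |V| - 1 rounds, stopping early once nothing changes.
    for _ in range(len(vertices) - 1):
        if not relax_all():
            break

    # One extra pass: any further improvement signals a negative cycle.
    negative_cycle_found = relax_all()
    return dist, predecessor, negative_cycle_found


# A single negative edge is no longer reported as a negative cycle...
print(bellman_ford_no_backtrack("ABC", [("A", "B", 2), ("B", "C", -3)], "A")[2])
# ...but a negative triangle (B, C, D) still is.
triangle = [("A", "B", 2), ("A", "C", 2), ("B", "C", -3), ("B", "D", 1), ("C", "D", 1)]
print(bellman_ford_no_backtrack("ABCD", triangle, "A")[2])
```

The extra pass deliberately applies the same predecessor check as the main rounds, so a lone negative edge cannot trigger the detection by bouncing back and forth across it.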
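Example: Consider the undirected graph G = {{A, B, C, D}, {AB=2, AC=2, BC=-3, BD=1, CD=1}}, in which the cycle B, C, D has total weight -3 + 1 + 1 = -1.
Distances after each round, updating B then C then D (A is checked at the start of each round but first changes in the fourth):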
A=0, B=C=D=infinity
A=0, B=2 from A, C=-1 from B, D=0 from C
A=0, B=1 from D, C=-2 from B, D=-1 from C
A=0, B=0 from D, C=-3 from B, D=-2 from C
A=-1 from C, B=-1 from D, C=-4 from B, D=-3 from C
...
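The rows above can be reproduced with a short script. This is a sketch only: the adjacency map `ADJ` and the helper `trace_rounds` are names invented for the illustration, and it assumes vertices are visited in the order A, B, C, D within each round, with each vertex taking any improving neighbour that passes the predecessor check.

```python
import math

# The undirected example graph, stored as an adjacency map.
ADJ = {
    "A": {"B": 2, "C": 2},
    "B": {"A": 2, "C": -3, "D": 1},
    "C": {"A": 2, "B": -3, "D": 1},
    "D": {"B": 1, "C": 1},
}

def trace_rounds(adj, source, rounds):
    """Run `rounds` vertex-by-vertex update rounds and record each state."""
    dist = {v: math.inf for v in adj}
    pred = {v: None for v in adj}
    dist[source] = 0
    history = []
    for _ in range(rounds):
        for v in adj:  # dicts keep insertion order: A, B, C, D
            for u, w in adj[v].items():
                if pred[u] == v:
                    continue  # skip length-2 cycles v -> u -> v
                if dist[u] + w < dist[v]:
                    dist[v] = dist[u] + w
                    pred[v] = u
        history.append((dict(dist), dict(pred)))
    return history

for dist, pred in trace_rounds(ADJ, "A", 4):
    print(", ".join(f"{v}={dist[v]} from {pred[v]}" for v in dist))
```

Each printed line corresponds to one row of the table after the initial state; the distances around the B, C, D triangle drop by 1 every round, which is the weight of that cycle.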
Here is a proof that the distances will continue changing indefinitely in the presence of a negative cycle:
Suppose otherwise. Then there is an assignment of distances which is stable: no possible update of any distance will decrease it. The order in which edges are checked is then irrelevant, because every edge, when checked, leaves the distances unchanged.
Pick a point on a negative cycle and consider the path that starts at that point and follows the cycle until it wraps round and returns to the same point. Since checking the first edge in this path leaves everything unchanged, the distance at the far end of that edge minus the distance at the near end must be no more than the length of the edge. Similarly, the distance two steps along the path minus the distance at the start of the path must be no more than the sum of the lengths of the two edges concerned, or we would update the distance to the further of the two points. Carrying on, we find that the distance at the end of the (circular) path must be no more than the distance at the start of the (circular) path plus the sum of the edge lengths along that path, or something would have been updated. But the start and end of the path are the same point, because it is circular, and the sum of the edge lengths is negative, because it is a negative cycle. That would make a distance no more than itself plus a negative number, which is impossible for the finite distances of vertices reachable from the source. So we reach a contradiction, and there must in fact be some update once we have checked all the edges along the circular path.
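The same argument can be written as a short chain of inequalities. The notation is introduced here only for illustration: the cycle is v_0, v_1, ..., v_k with v_k = v_0, w_i is the length of the edge from v_{i-1} to v_i, and d(v) is the supposedly stable (finite) distance of v.

```latex
\begin{align*}
d(v_i) &\le d(v_{i-1}) + w_i \qquad (i = 1, \dots, k)
  && \text{(no edge on the cycle triggers an update)} \\
\sum_{i=1}^{k} d(v_i) &\le \sum_{i=1}^{k} d(v_{i-1}) + \sum_{i=1}^{k} w_i
  && \text{(add the $k$ inequalities)} \\
0 &\le \sum_{i=1}^{k} w_i < 0
  && \text{(the two sums of $d$ cancel because $v_k = v_0$)}
\end{align*}
```

The last line is the contradiction: the left inequality comes from stability, the right one from the cycle being negative.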