
Should momentum also be applied to the bias term of every node in the network, or preferably only to the weights?

1 Answer


Apply it to both the bias and the weights. If you applied momentum only to the weights, the bias updates would lag behind the weight updates, artificially increasing the error and slowing convergence.

Think of bias as simply one more weight -- an extra input that's always 1.
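To make the "one more weight" view concrete, here is a minimal sketch for a single linear neuron trained with squared error. It assumes the common update form `delta = alpha * previous_delta - lr * gradient`; the function names, `lr`, `alpha`, and the toy data are all made up for illustration. The first function keeps a separate momentum term for the bias, the second folds the bias into the weight vector by appending a constant input of 1, and both should end up with the same parameters.

```python
def train_separate(xs, ys, lr=0.05, alpha=0.5, epochs=100):
    """Linear neuron y = w.x + b with momentum on the weights AND the bias."""
    n = len(xs[0])
    w, b = [0.0] * n, 0.0
    dw, db = [0.0] * n, 0.0  # previous deltas (the momentum state)
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            err = sum(wi * xi for wi, xi in zip(w, x)) + b - y
            # Gradient of 0.5 * err**2 is err * x_i for w_i and err * 1 for b.
            dw = [alpha * dwi - lr * err * xi for dwi, xi in zip(dw, x)]
            db = alpha * db - lr * err * 1.0
            w = [wi + d for wi, d in zip(w, dw)]
            b += db
    return w, b


def train_augmented(xs, ys, lr=0.05, alpha=0.5, epochs=100):
    """Same neuron, but the bias is just the weight on a constant input of 1."""
    xs_aug = [list(x) + [1.0] for x in xs]
    n = len(xs_aug[0])
    w, dw = [0.0] * n, [0.0] * n
    for _ in range(epochs):
        for x, y in zip(xs_aug, ys):
            err = sum(wi * xi for wi, xi in zip(w, x)) - y
            dw = [alpha * dwi - lr * err * xi for dwi, xi in zip(dw, x)]
            w = [wi + d for wi, d in zip(w, dw)]
    return w[:-1], w[-1]  # split back into (weights, bias)


# Toy data generated from y = 2x + 1 (illustrative only).
xs = [[0.0], [1.0], [2.0], [3.0]]
ys = [1.0, 3.0, 5.0, 7.0]

w_sep, b_sep = train_separate(xs, ys)
w_aug, b_aug = train_augmented(xs, ys)
print(w_sep, b_sep)
print(w_aug, b_aug)
```

Whether the momentum term is added or subtracted depends on the sign convention used for the delta. Here the delta already includes the minus sign from gradient descent, so the previous delta is added.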

Sneftel
  • OK, thanks! So I simply subtract the quantity "alpha_coefficient * previous_delta_bias" from my "delta bias", is that right? – user1861248 Feb 02 '16 at 10:01