
To implement L2 regularization for logistic regression, we add the L2 norm of the weights to the base loss: J(w) = CrossEntropy(y, ŷ) + (λ / 2m) · ‖w‖²
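
For concreteness, here is a minimal NumPy sketch of that regularized loss (the function and variable names are mine, just for illustration):

```python
import numpy as np

def l2_regularized_logistic_loss(w, X, y, lam):
    """Binary cross-entropy loss plus the L2 penalty (lam / 2m) * ||w||^2."""
    m = X.shape[0]
    y_hat = 1.0 / (1.0 + np.exp(-X @ w))            # sigmoid predictions
    cross_entropy = -np.mean(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))
    l2_penalty = (lam / (2 * m)) * np.sum(w ** 2)   # the added regularization term
    return cross_entropy + l2_penalty
```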

With multilayer neural networks we do the same, but additionally, during backward propagation we increase each weight's gradient by a term proportional to the weight itself: dW = dW_loss + (λ / m) · W
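
Again only a sketch with illustrative names: the extra term added to each weight's gradient during backprop is the derivative of that L2 penalty, i.e. (λ / m) times the weight itself:

```python
import numpy as np

def regularized_weight_gradient(dW_data, W, lam, m):
    """Gradient of the regularized loss w.r.t. one layer's weight matrix W.

    dW_data is the gradient of the unregularized (data) loss from backprop;
    differentiating the penalty (lam / 2m) * ||W||^2 adds the "weight decay"
    term (lam / m) * W.
    """
    return dW_data + (lam / m) * W
```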

The question is: why don't we just do the same for neural networks, i.e. only add the penalty term to the loss, as we do for logistic regression?

I can guess that it is connected with the fact that NNs have multiple layers, but I do not understand how and why this works.

I’m voting to close this question because it is not about programming as defined in the [help] but about ML theory. – desertnaut May 30 '21 at 15:25

1 Answer


As far as I know, the basic approach is to add a penalty to the empirical risk minimization problem, so maybe the other penalty comes from some other theoretical result that I don't know. If you want to take a look at the theoretical aspects of ML, I strongly recommend this book: https://www.cs.huji.ac.il/~shais/UnderstandingMachineLearning/understanding-machine-learning-theory-algorithms.pdf
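
Schematically (my notation, not taken from the book), the penalized ERM objective is:

```latex
\min_{w}\; \frac{1}{m}\sum_{i=1}^{m} \ell\bigl(h_w(x_i),\, y_i\bigr) \;+\; \lambda\, R(w),
\qquad \text{e.g. } R(w) = \tfrac{1}{2}\lVert w \rVert_2^2
```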