To implement L2 regularization for logistic regression, we add the L2 norm of the weights to the base loss:
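For concreteness, something like this (assuming the standard binary cross-entropy loss over $m$ examples with regularization strength $\lambda$; the $1/2$ and $1/m$ scaling factors vary between texts):

$$
J(w) = -\frac{1}{m}\sum_{i=1}^{m}\left[y^{(i)}\log\hat{y}^{(i)} + \left(1-y^{(i)}\right)\log\left(1-\hat{y}^{(i)}\right)\right] + \frac{\lambda}{2m}\lVert w\rVert_2^2
$$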
With multilayer neural networks we do the same, but additionally, during backward propagation we increase each weight's loss derivative by a term proportional to the weight itself:
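To make the notation concrete (this is my reading of the usual convention, with $m$ training examples, regularization strength $\lambda$, and $\mathcal{L}$ the unregularized loss), the gradient for the weight matrix of layer $l$ becomes:

$$
dW^{[l]} = \frac{\partial \mathcal{L}}{\partial W^{[l]}} + \frac{\lambda}{m} W^{[l]}
$$

And a minimal NumPy sketch of that adjustment (the names `add_l2_to_gradients`, `weights`, `grads`, `lam`, `m` are my own, hypothetical ones, not from any particular library, and the `lam / m` scaling assumes the penalty is written as in the logistic-regression formula above):

```python
import numpy as np

def add_l2_to_gradients(weights, grads, lam, m):
    """Add (lam / m) * W to each layer's weight gradient dW,
    corresponding to a (lam / (2 * m)) * ||W||^2 penalty in the loss."""
    return [dW + (lam / m) * W for W, dW in zip(weights, grads)]

# Hypothetical two-layer example with made-up weights and gradients.
rng = np.random.default_rng(0)
weights = [rng.standard_normal((4, 3)), rng.standard_normal((3, 1))]
grads = [np.ones_like(W) for W in weights]   # stand-in for backprop output
grads = add_l2_to_gradients(weights, grads, lam=0.1, m=100)
```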
The question is: why don't we just do the same for neural networks, i.e. only add the L2 norm to the loss, as we do for logistic regression?
I can guess that it is connected with the fact that neural networks have multiple layers, but I do not understand how or why this works.