
I have implemented the required backpropagation equations for the output layer, but for the hidden layers I get really confused with the chain rule. The more hidden layers there are, the more confusing it becomes.

How can I make the hidden layer equations easier to work out?

PS. I know calculus

1 Answer


Crypt John, Welcome :)

Since you said you are fluent in calculus and have already completed the output layer backpropagation, it will be easy for you once you learn about memoization.

Each hidden layer's error term is built from the error term of the layer after it (the one closer to the output). Apply the chain rule to a single weight of a hidden layer; here is the equation:

https://1.bp.blogspot.com/-AqNDf3KFUq8/XahC0NdsGkI/AAAAAAAAEvU/cglTGiej4-0Q-0ZYw3NKAgvanAxU6KlMgCLcBGAsYHQ/s1600/Percep.PNG

In the image above you can see that the differentiation flows through the following factors:

1. ∂Error / ∂Sigmoid_Output
2. ∂Sigmoid_Output / ∂Dot_Product
3. ∂Dot_Product / ∂Sigmoid_Hidden
4. ∂Sigmoid_Hidden / ∂Dot_Product_Hidden
5. ∂Dot_Product_Hidden / ∂Weight
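
To make the chain concrete, here is a minimal sketch in Python/NumPy, assuming a tiny 1-1-1 network with sigmoid activations and squared error; the variable names and numeric values are my own, chosen only to mirror the five factors above:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Tiny 1-1-1 network: one input, one sigmoid hidden unit, one sigmoid output unit.
x, y = 0.5, 1.0        # input and target (arbitrary illustrative values)
w_h, w_o = 0.3, -0.2   # hidden-layer and output-layer weights (arbitrary)

# Forward pass
z_h = w_h * x          # Dot_Product_Hidden
a_h = sigmoid(z_h)     # Sigmoid_Hidden
z_o = w_o * a_h        # Dot_Product
a_o = sigmoid(z_o)     # Sigmoid_Output
error = 0.5 * (a_o - y) ** 2

# The five factors from the list above
d1 = a_o - y              # 1. dError / dSigmoid_Output
d2 = a_o * (1.0 - a_o)    # 2. dSigmoid_Output / dDot_Product
d3 = w_o                  # 3. dDot_Product / dSigmoid_Hidden
d4 = a_h * (1.0 - a_h)    # 4. dSigmoid_Hidden / dDot_Product_Hidden
d5 = x                    # 5. dDot_Product_Hidden / dWeight

grad_w_h = d1 * d2 * d3 * d4 * d5   # chain rule: the product is dError/dw_h
```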

The first two factors are exactly the ones you already computed during the output-layer backpropagation, so you don't have to calculate them again for every hidden weight.

Similarly, for a deeper hidden layer, the factors inherited from the layers above it stay the same, so you compute them once and reuse them. Caching and reusing these intermediate results is called memoization.
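
Here is a hedged sketch of how that reuse looks in code, assuming a plain fully connected network with sigmoid activations, squared error, and no biases (the function name `backprop` and the layer sizes are just for illustration): the output-layer `delta` is computed once, and every deeper layer only multiplies it by its own two new factors:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def backprop(weights, x, y):
    """Backprop with memoization: each layer's 'delta' is cached and
    reused by the layer below it, so the upstream factors are never
    recomputed."""
    # Forward pass, keeping the activations we will need again
    activations = [x]
    for W in weights:
        activations.append(sigmoid(W @ activations[-1]))

    grads = [None] * len(weights)
    # Output layer: delta = dError/dDot_Product (factors 1 and 2)
    a_out = activations[-1]
    delta = (a_out - y) * a_out * (1 - a_out)
    for l in reversed(range(len(weights))):
        grads[l] = np.outer(delta, activations[l])        # times factor 5 for this layer
        if l > 0:
            a = activations[l]
            # Reuse the memoized delta: only multiply by factors 3 and 4
            delta = (weights[l].T @ delta) * a * (1 - a)
    return grads

# Hypothetical usage: a 2-3-2-1 network on a single example
rng = np.random.default_rng(0)
weights = [rng.normal(size=(3, 2)), rng.normal(size=(2, 3)), rng.normal(size=(1, 2))]
x = np.array([0.5, -0.1])
y = np.array([1.0])
grads = backprop(weights, x, y)
```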

Check out this webpage for a step-by-step walkthrough: https://www.hellocodings.com/2019/10/step-by-step-back-propagation.html

Regards