I think I'm following the same online tutorial as the one mentioned in the post "How to convert deep learning gradient descent equation into python".
I understand that we have to calculate the cost and db, but my question is: why do they put axis=0 in both equations? In other words, I do not understand what axis=0 is used for in this calculation. What would the result be if you did the calculation without axis=0?
import numpy as np
cost = -1*((np.sum(np.dot(Y,np.log(A))+np.dot((1-Y),(np.log(1-A))),axis=0))/m)
db = np.sum((A-Y),axis=0)/m
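To make the question concrete, here is a minimal sketch of how I understand np.sum to behave with and without axis=0. The toy arrays, the name X, and the (1, m) shape for A and Y are my own assumptions for illustration, not values taken from the tutorial.

```python
import numpy as np

# Generic 2-D example (hypothetical data) to show what the axis argument does.
X = np.array([[1, 2, 3],
              [4, 5, 6]])
print(np.sum(X, axis=0))  # adds down the rows: one value per column
print(np.sum(X, axis=1))  # adds across the columns: one value per row
print(np.sum(X))          # no axis: adds every element into one number

# Assumed shapes: A and Y are (1, m) row vectors. The tutorial may differ.
A = np.array([[0.8, 0.4, 0.1]])  # made-up predicted probabilities
Y = np.array([[1.0, 0.0, 0.0]])  # made-up labels
m = A.shape[1]

db_axis0 = np.sum(A - Y, axis=0) / m  # only the single row is collapsed
db_noaxis = np.sum(A - Y) / m         # everything is collapsed to a scalar

print(db_axis0.shape)        # shape of the axis=0 result
print(np.ndim(db_noaxis))    # number of dimensions of the no-axis result
```

With these assumed shapes, axis=0 appears to leave one value per column, while omitting it gives a single number, so the answer may depend on how A and Y are actually shaped in the tutorial.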