I am trying to compute the derivative of the softmax function. I have a 2D NumPy array and I compute the softmax along axis 1. My Python code for this is:
    import numpy as np

    def softmax(z):
        # Row-wise softmax: normalize the exponentials along axis 1
        return np.exp(z) / np.sum(np.exp(z), axis=1, keepdims=True)
My Python code for calculating the derivative of the softmax is:
    def softmax_derivative(Q):
        x = softmax(Q)
        s = x.reshape(-1, 1)
        return (np.diagflat(s) - np.dot(s, s.T))
Is this the correct approach?
Also, if my NumPy array has shape (3,3), what would be the shape of the array returned by the softmax derivative? Would it be (9,9)?
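To make the shape question concrete, here is a minimal sketch that runs the two functions above on a (3,3) input and prints the resulting shape. For comparison it also builds one Jacobian per row. The helper name softmax_jacobian_per_row is hypothetical (it is not part of my original code), and it rests on the assumption that each row is an independent softmax, which is what axis=1 implies.

```python
import numpy as np

def softmax(z):
    # Row-wise softmax, as in the question
    return np.exp(z) / np.sum(np.exp(z), axis=1, keepdims=True)

def softmax_derivative(Q):
    # The function from the question, unchanged: flattens all rows into one vector
    x = softmax(Q)
    s = x.reshape(-1, 1)
    return np.diagflat(s) - np.dot(s, s.T)

def softmax_jacobian_per_row(Q):
    # Hypothetical helper: one (n, n) Jacobian diag(s_i) - s_i s_i^T per row i
    s = softmax(Q)                                          # shape (m, n)
    diag = np.einsum('ij,jk->ijk', s, np.eye(s.shape[1]))   # shape (m, n, n)
    outer = np.einsum('ij,ik->ijk', s, s)                   # shape (m, n, n)
    return diag - outer

Q = np.arange(9, dtype=float).reshape(3, 3)
flat = softmax_derivative(Q)
per_row = softmax_jacobian_per_row(Q)

print(flat.shape)     # shape of the flattened version
print(per_row.shape)  # shape of the stacked per-row Jacobians
```

With a (3,3) input the flattened version does return a (9,9) array, while the per-row version returns (3,3,3). The 3x3 diagonal blocks of the (9,9) result match the per-row Jacobians, but the off-diagonal blocks hold nonzero terms between different rows. Since the softmax of one row does not depend on any other row, those cross terms should be zero, which suggests the flattening step is where the approach goes wrong for 2D input.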