I am trying to compute the derivative of the softmax function. I have a 2D NumPy array and I am calculating the softmax along axis 1. My Python code for this is:

import numpy as np

def softmax(z):
    # Row-wise softmax: normalize each row of z by the sum of its exponentials.
    return np.exp(z) / np.sum(np.exp(z), axis=1, keepdims=True)

My Python code for calculating the derivative of the softmax is:

def softmax_derivative(Q):
    x = softmax(Q)
    s = x.reshape(-1, 1)
    return np.diagflat(s) - np.dot(s, s.T)

Is this the correct approach?

Also, if my NumPy array has shape (3, 3), what would be the shape of the array returned by the softmax derivative? Would it be (9, 9)?

wizzup
cherry13

1 Answer

I would subtract the maximum value of z and do something like:

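def softmax(z):
    # Subtract each row's maximum before exponentiating to avoid overflow;
    # keep axis=1 so the result is still a row-wise softmax.
    exps = np.exp(z - z.max(axis=1, keepdims=True))
    return exps / np.sum(exps, axis=1, keepdims=True)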

to improve numerical stability, but otherwise what you are doing is correct for a single softmax vector.
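On the shape question, here is a minimal sketch of how it might look when the softmax is taken along axis 1. The helper name softmax_jacobian_rows is hypothetical, not something from the question or from NumPy. It assumes each row is an independent softmax, so it builds one Jacobian diag(s) - s sᵀ per row and stacks them, giving shape (3, 3, 3) for a (3, 3) input. Flattening to a 9-element vector first, as in the question, does return (9, 9), but the off-diagonal 3x3 blocks then hold nonzero cross-row terms, even though one row's output does not depend on another row's input.

```python
import numpy as np

def softmax(z):
    # Row-wise softmax with each row's max subtracted for numerical stability.
    exps = np.exp(z - z.max(axis=1, keepdims=True))
    return exps / exps.sum(axis=1, keepdims=True)

def softmax_jacobian_rows(z):
    # Hypothetical helper: one (n, n) Jacobian per row, stacked to (m, n, n).
    s = softmax(z)                               # shape (m, n)
    diag = s[:, :, None] * np.eye(s.shape[1])    # diag(s_i) for each row i
    outer = s[:, :, None] * s[:, None, :]        # s_i s_i^T for each row i
    return diag - outer

z = np.arange(9, dtype=float).reshape(3, 3)
J = softmax_jacobian_rows(z)
print(J.shape)  # (3, 3, 3)
```

If a single 2D matrix is really needed, the per-row Jacobians can be placed on the block diagonal of a (9, 9) array with zeros elsewhere.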

unrahul