I'm interested in building the derivative of softmax in TensorFlow, and as a new user I'm stuck.
The closest code I can find is a NumPy version, from "Softmax derivative in NumPy approaches 0 (implementation)"; the code is below. I can translate the softmax portion into TensorFlow easily, but I'm stuck on how to apply the derivative section: the three lines under "if derivative" are giving me trouble. How would you go about building those three lines in TensorFlow?
Thank you.
Derivative Portion
if derivative:
    J = - signal[..., None] * signal[:, None, :] # off-diagonal Jacobian
    iy, ix = np.diag_indices_from(J[0])
    J[:, iy, ix] = signal * (1. - signal) # diagonal
    return J.sum(axis=1)
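For context, my rough attempt so far is below. It's only a sketch, assuming TensorFlow 2.x and a batched (batch, classes) input, and I'm guessing that tf.linalg.set_diag is the right replacement for the np.diag_indices_from assignment:

import tensorflow as tf

def softmax_derivative(signal):
    # signal: (batch, classes) tensor of raw activations
    s = tf.nn.softmax(signal, axis=1)
    # Off-diagonal Jacobian entries: J[b, i, j] = -s_i * s_j
    J = -s[:, :, None] * s[:, None, :]
    # Overwrite the diagonal with s_i * (1 - s_i)
    J = tf.linalg.set_diag(J, s * (1.0 - s))
    # Sum over one Jacobian axis, matching J.sum(axis=1) in the NumPy version
    return tf.reduce_sum(J, axis=1)

I'm not sure whether set_diag is actually the right tool here, which is part of what I'm asking.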
Here is the full code from the link above.
import numpy as np

def softmax_function( signal, derivative=False ):
    # Calculate activation signal
    e_x = np.exp( signal )
    signal = e_x / np.sum( e_x, axis = 1, keepdims = True )

    if derivative:
        J = - signal[..., None] * signal[:, None, :] # off-diagonal Jacobian
        iy, ix = np.diag_indices_from(J[0])
        J[:, iy, ix] = signal * (1. - signal) # diagonal
        return J.sum(axis=1)
    else:
        # Return the activation signal
        return signal
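To make the expected shapes concrete, this is how I'm calling the NumPy version (a made-up batch of two 3-class signals, just for illustration):

x = np.array([[1.0, 2.0, 3.0],
              [0.2, 0.1, 0.7]])

probs = softmax_function(x)                   # shape (2, 3); each row sums to 1
deriv = softmax_function(x, derivative=True)  # shape (2, 3)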