
I am trying to write a neural network (MLP) model from scratch. However, I am stuck on the derivative of the softmax function. I know that the softmax function in Python is

import numpy as np

def softmax(input_value):
    # subtract the max for numerical stability; the result is unchanged
    # (use assignment rather than -= so the caller's array is not mutated)
    input_value = input_value - np.max(input_value)
    return np.exp(input_value) / np.sum(np.exp(input_value))

However, I don't know how to write the code for the softmax derivative. Can anyone show me how to write it in Python? Thank you so much!
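For reference, the derivative of softmax with respect to its inputs is a Jacobian matrix, not a single number. A short derivation of the standard result (not part of the original post):

```latex
s_i = \frac{e^{x_i}}{\sum_k e^{x_k}}
\qquad
\frac{\partial s_i}{\partial x_j}
= \frac{\delta_{ij}\, e^{x_i} \sum_k e^{x_k} - e^{x_i} e^{x_j}}{\left(\sum_k e^{x_k}\right)^2}
= s_i \left(\delta_{ij} - s_j\right)
```

In matrix form this is `J = diag(s) - s s^T`, which is exactly what the answer below constructs per sample.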


1 Answer

def derivative(self, x: np.ndarray) -> np.ndarray:
    # x has shape (n, k): n samples, k classes.
    # Returns a (k, k, n) array: one softmax Jacobian per sample.
    n, k = x.shape
    D = np.zeros((k, k, n))
    for i in range(n):
        sample = x[i:i+1, :]          # shape (1, k)
        val = self.value(sample)      # softmax of this sample, shape (1, k)
        # Jacobian: diag(s) - s s^T, i.e. J[a, b] = s_a * (delta_ab - s_b);
        # val.T * val broadcasts (k, 1) * (1, k) into the (k, k) outer product
        D[:, :, i] = np.diag(val.reshape(-1)) - val.T * val
    return D
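A minimal, self-contained version of the same idea, with a finite-difference sanity check. The names `softmax_jacobian` and the check itself are mine, not from the original answer:

```python
import numpy as np

def softmax(x):
    # subtract the max for numerical stability; the result is unchanged
    x = x - np.max(x)
    e = np.exp(x)
    return e / np.sum(e)

def softmax_jacobian(x):
    # Jacobian of softmax for a single 1-D input:
    # J[i, j] = s_i * (delta_ij - s_j) = diag(s) - s s^T
    s = softmax(x).reshape(-1, 1)
    return np.diagflat(s) - s @ s.T

x = np.array([1.0, 2.0, 3.0])
J = softmax_jacobian(x)
# each row of J sums to zero, because the softmax outputs always sum to 1
```

Comparing column `j` of `J` against a centered finite difference of `softmax` with respect to `x[j]` is a quick way to confirm the formula is implemented correctly.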