5

I am doing multilabel classification with a recurrent neural network. My question is about the loss function: the output is a vector of true/false (1/0) values, one per label. Many resources say the Hamming loss is the appropriate objective. However, the Hamming loss has a problem in the gradient calculation: H = average(y_true XOR y_pred), and XOR is not differentiable, so the loss gives no usable gradient. Are there other loss functions for training multilabel classifiers? I've tried MSE and binary cross-entropy with a separate sigmoid for each label.

William Chou

1 Answer

5

H = average(y_true*(1-y_pred)+(1-y_true)*y_pred)

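is a continuous relaxation of the Hamming loss: it equals the Hamming loss exactly when y_pred is binary, and it is differentiable in y_pred when y_pred is a sigmoid output in [0, 1].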
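To make the formula concrete, here is a minimal sketch in plain Python. The function names `soft_hamming_loss` and `soft_hamming_grad` and the sample vectors are made up for illustration; in a real model you would write the same expression with your framework's tensor ops and let autodiff handle the gradient.

```python
def soft_hamming_loss(y_true, y_pred):
    """Mean of y*(1-p) + (1-y)*p over all labels.

    y_true: list of 0/1 labels; y_pred: list of values in [0, 1].
    """
    terms = [y * (1.0 - p) + (1.0 - y) * p for y, p in zip(y_true, y_pred)]
    return sum(terms) / len(terms)


def soft_hamming_grad(y_true):
    """Gradient of the loss w.r.t. each prediction: (1 - 2*y) / n.

    It does not depend on y_pred, because the loss is linear in y_pred.
    """
    n = len(y_true)
    return [(1.0 - 2.0 * y) / n for y in y_true]


y_true = [1, 0, 1, 0]
hard_pred = [1, 1, 0, 0]          # binary predictions, two of four labels wrong
soft_pred = [0.9, 0.2, 0.6, 0.1]  # illustrative sigmoid-style outputs

hard_loss = soft_hamming_loss(y_true, hard_pred)  # matches the XOR-based Hamming loss
soft_loss = soft_hamming_loss(y_true, soft_pred)
grad = soft_hamming_grad(y_true)

print(hard_loss, soft_loss, grad)
```

Note that the gradient with respect to y_pred has constant magnitude 1/n, so when y_pred comes from a sigmoid the signal reaching the logits is scaled by y_pred*(1-y_pred) and gets small for saturated outputs.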

Juan Wang