I'm training a CNN for multi-label classification with around 160 labels. With a standard CNN architecture, a sigmoid output layer, and binary_crossentropy as the loss, the network is biased toward predicting zeros: only a few of the 160 labels are positive for any given sample, and since the loss averages over all outputs, it stays low even when every output is zero, including the correct labels. Does anyone have a solution?
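To make the imbalance concrete, here is a small NumPy sketch (the label count and positive indices are made up): with only 3 positives out of 160, a prediction that misses every positive still averages out to a small loss, because the many easy negatives dilute the few missed positives. This is a plausible reading of why training can settle near all-zeros early on, not a claim about any specific framework's internals.

```python
import numpy as np

# Hypothetical sample: 160 labels, only 3 of them active.
n_labels = 160
positives = [4, 27, 131]          # made-up indices
y_true = np.zeros(n_labels)
y_true[positives] = 1.0

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    # Mean over all 160 independent sigmoid outputs,
    # as Keras' binary_crossentropy does per sample.
    y_pred = np.clip(y_pred, eps, 1 - eps)
    return -np.mean(y_true * np.log(y_pred)
                    + (1 - y_true) * np.log(1 - y_pred))

# Prediction A: near-zero everywhere, i.e. every positive is missed.
all_zeros = np.full(n_labels, 0.01)
# Prediction B: confident on the 3 true labels, near-zero elsewhere.
correct = np.full(n_labels, 0.01)
correct[positives] = 0.99

loss_zeros = binary_crossentropy(y_true, all_zeros)
loss_correct = binary_crossentropy(y_true, correct)

# Each missed positive costs -log(0.01) ~ 4.6 on its own, but averaged
# over 160 outputs the all-zeros prediction still scores under 0.1.
print(loss_zeros, loss_correct)
```

Note that the correct prediction does still achieve a lower loss than all-zeros; the problem is that the penalty for missing every positive is heavily diluted, so the gradient signal pushing the positive outputs up is weak.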
1 Answer
Use categorical cross-entropy instead of binary cross-entropy, and softmax instead of sigmoid in the output layer.
