
I just read through this link that explains the difference between binary cross-entropy and categorical cross-entropy, and in particular, I had a question about this picture:

[image from the linked article omitted]

The author addressed the multi-label case where your target (or ground truth) labels are one-hot encoded, but what loss function would you use if your target labels were not one-hot encoded? For example, if only half of the panda was in the image and I then labeled the image as [1, 0, 0.5], would I still use binary cross-entropy in this case? How does the math work out for cases where the target vector is not binary?

Malek

1 Answer


The author of the article you are referring to mentions that cross-entropy works in all cases as long as you have two probability vectors: one for the target and one for the prediction.
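As a minimal sketch of that point, cross-entropy is just -Σ target·log(pred), and the formula accepts any probability vector as the target, one-hot or not. The vectors below are made-up examples, not taken from the article:

```python
import numpy as np

def cross_entropy(target, pred, eps=1e-12):
    """H(target, pred) = -sum(target * log(pred)).

    Works for any two probability vectors of the same length.
    """
    pred = np.clip(pred, eps, 1.0)  # avoid log(0)
    return -np.sum(target * np.log(pred))

pred = np.array([0.7, 0.2, 0.1])          # e.g. a softmax output

one_hot = np.array([1.0, 0.0, 0.0])       # hard (one-hot) target
soft = np.array([0.5, 0.0, 0.5])          # soft target also works

print(cross_entropy(one_hot, pred))       # reduces to -log(0.7)
print(cross_entropy(soft, pred))          # weighted sum of -log terms
```

With a one-hot target the sum collapses to the single term for the true class; with a soft target each class contributes in proportion to its target probability.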

if only half of the panda was in the image and I then labeled the image as [1, 0, 0.5], would I still use binary cross-entropy in this case?

Answer: No, you would use multi-label cross-entropy here, since multiple labels are present in the same image. The target probability vector can be converted into a one-hot-encoded vector, [1, 0, 1]: the 0.5 becomes 1 because an object with that label is present in the image. For the calculation you can refer to the article, where the author explains multi-label classification:

Our target can represent multiple (or even zero) classes at once. We compute the binary cross-entropy for each class separately and then sum them up for the complete loss.
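The quoted procedure can be sketched as follows. The target [1, 0, 1] is the one-hot version of the question's [1, 0, 0.5]; the sigmoid outputs are made-up values for illustration:

```python
import numpy as np

def multi_label_bce(target, pred, eps=1e-12):
    """Binary cross-entropy per class, then summed over classes."""
    pred = np.clip(pred, eps, 1.0 - eps)  # keep log() finite
    per_class = -(target * np.log(pred) + (1 - target) * np.log(1 - pred))
    return per_class.sum()

target = np.array([1.0, 0.0, 1.0])  # class present, absent, present
pred = np.array([0.9, 0.2, 0.6])    # independent sigmoid outputs

print(multi_label_bce(target, pred))
```

Each class is treated as its own independent binary problem (hence sigmoid outputs rather than a softmax), and the per-class losses are summed to give the complete loss.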

iamarchisha