
In my experiment I am trying to train a neural network to detect whether patients exhibit symptoms A, B, C, and D. My data consists of photos of each patient taken from different angles, along with labels for whether or not they have each of symptoms A, B, C, and D.

Right now, in PyTorch, I am using MSELoss and calculating my test error as the total number of correct classifications divided by the total number of classifications. I'm guessing this is too naive, and maybe even inappropriate.

An example of a test error computation would go like this: suppose we have two patients with two images each. Then there would be 16 total classifications (one for whether patient 1 has symptom A, B, C, or D in photo 1, and so on). If the model correctly predicted that patient 1 exhibited symptom A in photo 1, that would add 1 to the total number of correct classifications.
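
For concreteness, here is a minimal sketch of that accuracy computation in PyTorch (the tensors are random stand-ins for real model outputs and labels):

```python
import torch

# 2 patients x 2 photos = 4 images, each with 4 symptom labels,
# giving 16 total classifications. Values here are random placeholders.
preds = torch.rand(4, 4)                        # model scores in [0, 1]
targets = torch.randint(0, 2, (4, 4)).float()   # ground-truth 0/1 labels

# Threshold the scores at 0.5, then count element-wise matches out of all 16.
correct = ((preds > 0.5).float() == targets).sum().item()
accuracy = correct / targets.numel()
print(f"{correct} / {targets.numel()} correct ({accuracy:.2%})")
```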


1 Answer


I suggest using binary cross-entropy for multi-class, multi-label classification. This may seem counterintuitive for multi-label classification, but keep in mind that the goal here is to treat each output label as an independent distribution (or class).

In PyTorch you can use `torch.nn.BCELoss(weight=None, size_average=None, reduce=None, reduction='mean')`. This creates a criterion that measures the binary cross-entropy between the target and the output.
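
To make that concrete, here is a minimal sketch of a multi-label setup with `BCELoss` for the 4-symptom problem above (the feature size, batch size, and layer widths are placeholder assumptions, not from the question):

```python
import torch
import torch.nn as nn

# Toy multi-label model: 4 independent sigmoid outputs, one per symptom.
# The 128-dimensional input and the hidden width are illustrative only.
model = nn.Sequential(
    nn.Linear(128, 64),
    nn.ReLU(),
    nn.Linear(64, 4),
    nn.Sigmoid(),       # BCELoss expects probabilities in [0, 1]
)
criterion = nn.BCELoss()

features = torch.randn(8, 128)                  # batch of 8 "images"
targets = torch.randint(0, 2, (8, 4)).float()   # one 0/1 target per label

loss = criterion(model(features), targets)
loss.backward()
```

Note that `BCELoss` expects probabilities, so the network must end in a sigmoid; the comments below discuss a more numerically stable variant.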

  • Worth mentioning that the output layer should have a sigmoid activation. Or, alternatively, use no activation and `BCEWithLogitsLoss`, which has a sigmoid built in and is more numerically stable. – sebrockm Nov 19 '19 at 08:16
  • But OP said it is multi-class, so wouldn't we want regular cross entropy? Not binary. – conv3d Nov 19 '19 at 21:16
  • @jchaykow OP is trying to classify one patient along binary classes A, B, C, D, which is a binary classification problem with 4 predictions. – Eduardo Barrera Nov 19 '19 at 21:37
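
A minimal sketch of the alternative sebrockm describes above, assuming the same toy shapes as the earlier sketch: drop the final sigmoid layer and let `BCEWithLogitsLoss` apply it internally, which is more numerically stable.

```python
import torch
import torch.nn as nn

# Same illustrative shapes as before; the final Sigmoid is gone because
# BCEWithLogitsLoss applies the sigmoid internally to the raw logits.
model = nn.Sequential(
    nn.Linear(128, 64),
    nn.ReLU(),
    nn.Linear(64, 4),   # raw logits, one per symptom label
)
criterion = nn.BCEWithLogitsLoss()

features = torch.randn(8, 128)
targets = torch.randint(0, 2, (8, 4)).float()

loss = criterion(model(features), targets)
loss.backward()
```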