How do I train a multi-label classification model when each label should return more than one class? Example: an image classifier has 2 labels, style with 4 classes and layout with 5 classes. Each image in the list should return 2 styles and 3 layouts, like [1 0 1 0] [1 1 0 0 1].

My sample net:

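import torch.nn as nn
from torchvision import models

class MyModel(nn.Module):
    def __init__(self, n_classes_layout=5, n_classes_style=4):
        super().__init__()
        # Pretrained ResNet-50 backbone; move the whole model to the
        # target device after construction, e.g. MyModel().to(device)
        self.base_model = models.resnet50(pretrained=True)
        last_channel = self.base_model.fc.in_features
        # Replace the classifier so the backbone returns pooled features
        self.base_model.fc = nn.Identity()

        # One sigmoid head per label: an independent probability per class
        self.layout = nn.Sequential(
            nn.Dropout(0.2),
            nn.Linear(last_channel, n_classes_layout),
            nn.Sigmoid()
        )
        self.style = nn.Sequential(
            nn.Dropout(0.2),
            nn.Linear(last_channel, n_classes_style),
            nn.Sigmoid()
        )

    def forward(self, x):
        base = self.base_model(x)
        return self.layout(base), self.style(base)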

def loss_fn(outputs, targets):
    o1, o2 = outputs
    t1, t2 = targets
  • This is the very definition of *multi-label* classification, in contrast to *multi-class*, where the samples can belong to one class only. I kindly suggest you do some research first - this question is way too vague and broad for SO. – desertnaut Sep 29 '21 at 20:35

1 Answer

I am not sure what you mean by "label" here, but it seems you have a multi-output model that predicts style on one hand and layout on the other. So I am assuming you are dealing with a multi-task network with two "independent" outputs, each supervised separately with its own loss term.

Consider one of the two: you can treat this multi-label classification task as outputting n values, each one estimating the probability that the corresponding label is present in the input. In your first task you have four classes; in the second, five.
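To make the "n independent probabilities" view concrete, here is a minimal sketch for the style task. The probability values are made up for illustration, and the 0.5 threshold and the top-k variant are assumptions on my part, not something stated in the question.

```python
import torch

# Multi-hot target from the question: styles 0 and 2 are present.
style_target = torch.tensor([1., 0., 1., 0.])

# Hypothetical per-class probabilities, as a sigmoid head might output them.
style_probs = torch.tensor([0.91, 0.12, 0.77, 0.30])

# Each class is decided independently of the others (assumed threshold: 0.5).
style_pred = (style_probs > 0.5).float()

# If exactly 2 styles must always be returned, taking the 2 highest
# probabilities is one option instead of thresholding.
top2 = torch.topk(style_probs, k=2).indices
```

Thresholding can return any number of classes per image, while top-k forces a fixed count; which one fits depends on whether "2 styles and 3 layouts" is a hard constraint in your data.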

Unlike a single-label classification task, here you want one activation per label: apply a sigmoid to each logit, then apply the binary cross-entropy loss to each of those outputs. In PyTorch, you can use BCELoss, which handles multi-dimensional tensors. In practice you could concatenate the two outputs from the two tasks, do the same with the target labels, and call the criterion once.
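A minimal sketch of that concatenation approach, assuming the model returns (layout, style) probabilities as in the question and that the targets are float multi-hot tensors of matching shape. The dummy batch and the variable names are only for illustration.

```python
import torch
import torch.nn as nn

# BCELoss expects probabilities in [0, 1], i.e. outputs that already
# went through a sigmoid, as in the question's model.
criterion = nn.BCELoss()

def loss_fn(outputs, targets):
    layout_out, style_out = outputs
    layout_t, style_t = targets
    # Concatenate along the class dimension and call the criterion once.
    preds = torch.cat([layout_out, style_out], dim=1)
    labels = torch.cat([layout_t, style_t], dim=1)
    return criterion(preds, labels)

# Dummy batch of 2 samples standing in for real model outputs.
torch.manual_seed(0)
layout_out = torch.sigmoid(torch.randn(2, 5))
style_out = torch.sigmoid(torch.randn(2, 4))
layout_t = torch.tensor([[1., 1., 0., 0., 1.], [0., 1., 0., 1., 0.]])
style_t = torch.tensor([[1., 0., 1., 0.], [0., 1., 0., 0.]])

loss = loss_fn((layout_out, style_out), (layout_t, style_t))
```

With the default mean reduction, the concatenated loss averages over all 9 outputs, so the 5-class layout task weighs slightly more than the 4-class style task. Summing two separate BCELoss calls would weight the tasks equally instead. If you drop the nn.Sigmoid layers from the heads, BCEWithLogitsLoss is the numerically more stable counterpart.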

Ivan