
I am fairly new to deep learning and have some questions about multi-label image classification with Keras convolutional neural networks, mainly about how to evaluate Keras models on multi-label tasks. I will structure this a bit to give a better overview first.

Problem Description

The underlying dataset consists of album cover images from different genres, in my case electronic, rock, jazz, pop, and hiphop. So there are 5 possible classes that are not mutually exclusive. The task is to predict the possible genres for a given album cover. Each album cover is 300px x 300px. The images are loaded into TensorFlow datasets and resized to 150px x 150px.

[Figure: dataset distribution in training, validation and test data]
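Because the genres are not mutually exclusive, each cover's target can be represented as a multi-hot vector with one slot per genre. A minimal sketch in plain Python, assuming this particular genre order and a hypothetical helper name `encode_genres` (neither is taken from the actual data pipeline):

```python
# Genre order is an assumption for illustration; any fixed order works
# as long as it is used consistently for training and evaluation.
GENRES = ["electronic", "rock", "jazz", "pop", "hiphop"]


def encode_genres(genres):
    """Return a multi-hot vector with a 1.0 for every genre present."""
    unknown = set(genres) - set(GENRES)
    if unknown:
        raise ValueError(f"unknown genres: {sorted(unknown)}")
    return [1.0 if g in genres else 0.0 for g in GENRES]


# An album tagged as both rock and pop sets two slots at once.
print(encode_genres(["rock", "pop"]))  # [0.0, 1.0, 0.0, 1.0, 0.0]
```

With targets in this form, a sigmoid output per class and a binary cross-entropy loss treat every slot as its own yes/no decision.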

Model Architecture

The architecture for the model is the following.

import tensorflow as tf

from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.models import Sequential

# Images are resized to 150px x 150px when loaded (see problem description)
img_height = 150
img_width = 150

data_augmentation = keras.Sequential(
  [
    layers.experimental.preprocessing.RandomFlip("horizontal",
                                                 input_shape=(img_height,
                                                              img_width,
                                                              3)),
    layers.experimental.preprocessing.RandomFlip("vertical"),
    layers.experimental.preprocessing.RandomRotation(0.4),
    layers.experimental.preprocessing.RandomZoom(height_factor=(0.2, 0.6), width_factor=(0.2, 0.6))
  ]
)

def create_model(num_classes=5, augmentation_layers=None):
  model = Sequential()

  # We can pass a list of layers performing data augmentation here
  if augmentation_layers:
    # The first layer of the augmentation layers must define the input shape
    model.add(augmentation_layers)
    model.add(layers.experimental.preprocessing.Rescaling(1./255))
  else:
    model.add(layers.experimental.preprocessing.Rescaling(1./255, input_shape=(img_height, img_width, 3)))

  model.add(layers.Conv2D(32, (3, 3), activation='relu'))
  model.add(layers.MaxPooling2D((2, 2)))
  model.add(layers.Conv2D(64, (3, 3), activation='relu'))
  model.add(layers.MaxPooling2D((2, 2)))
  model.add(layers.Conv2D(128, (3, 3), activation='relu'))
  model.add(layers.MaxPooling2D((2, 2)))
  model.add(layers.Conv2D(128, (3, 3), activation='relu'))
  model.add(layers.MaxPooling2D((2, 2)))
  model.add(layers.Flatten())
  model.add(layers.Dense(512, activation='relu'))

  # Use a sigmoid activation on the output layer. Combined with the binary
  # cross-entropy loss, this effectively trains one binary classifier per class.
  model.add(layers.Dense(num_classes, activation='sigmoid'))
  model.summary()

  return model
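To get a feel for the size of this network, the feature-map shapes can be worked out by hand. The sketch below assumes the 150 x 150 input from the problem description and the Keras defaults for `Conv2D` (stride 1, `padding='valid'`) and `MaxPooling2D` (stride equal to the pool size); the helper names are illustrative only:

```python
def conv_valid(size, kernel=3):
    """Output size of a stride-1 convolution with 'valid' padding."""
    return size - kernel + 1


def max_pool(size, pool=2):
    """Output size of non-overlapping max pooling (floor division)."""
    return size // pool


size = 150  # input height/width after resizing
filters = [32, 64, 128, 128]
for f in filters:
    size = max_pool(conv_valid(size))
    print(f"after conv({f}) + pool: {size} x {size} x {f}")

flat = size * size * filters[-1]
dense_params = flat * 512 + 512  # weights + biases of Dense(512)
print("flattened features:", flat)
print("Dense(512) parameters:", dense_params)
```

Under these assumptions the last pooling layer yields 7 x 7 x 128 = 6272 features, so the `Dense(512)` layer alone holds about 3.2 million parameters, which is most of the model.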

I'm not using the usual metrics such as standard accuracy here. In this paper I read that multi-label classification models cannot be evaluated with the usual methods. Chapter 7 (evaluation metrics) presents the Hamming loss and an adjusted accuracy (a variant of exact match), both of which I use for this model.

The Hamming loss is already provided by tensorflow-addons (see here), and I found an implementation of the subset accuracy here.

from tensorflow_addons.metrics import HammingLoss

hamming_loss = HammingLoss(mode="multilabel", threshold=0.5)

def subset_accuracy(y_true, y_pred):
    # From https://stackoverflow.com/questions/56739708/how-to-implement-exact-match-subset-accuracy-as-a-metric-for-keras

    threshold = tf.constant(.5, tf.float32)
    gtt_pred = tf.math.greater(y_pred, threshold)
    gtt_true = tf.math.greater(y_true, threshold)
    accuracy = tf.reduce_mean(tf.cast(tf.equal(gtt_pred, gtt_true), tf.float32), axis=-1)
    return accuracy

# Create model
model = create_model(num_classes=5, augmentation_layers=data_augmentation)

# Compile model
model.compile(loss="binary_crossentropy", optimizer="adam", metrics=[subset_accuracy, hamming_loss])

# Fit the model
history = model.fit(training_dataset, epochs=epochs, validation_data=validation_dataset, callbacks=callbacks)
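For reference, `subset_accuracy` as written averages the per-label matches over the last axis, so it returns the fraction of correctly predicted labels per sample (in effect the complement of the Hamming loss) rather than a strict exact match. The following sketch in plain Python, with made-up labels and scores, contrasts the two; the helper names are illustrative and not part of Keras or tensorflow-addons:

```python
def binarize(row, threshold=0.5):
    """Threshold scores the same way subset_accuracy does (strictly greater)."""
    return [value > threshold for value in row]


def label_wise_accuracy(y_true, y_pred, threshold=0.5):
    """Fraction of matching labels per sample, averaged over samples."""
    per_sample = []
    for t_row, p_row in zip(y_true, y_pred):
        t, p = binarize(t_row, threshold), binarize(p_row, threshold)
        per_sample.append(sum(a == b for a, b in zip(t, p)) / len(t))
    return sum(per_sample) / len(per_sample)


def exact_match(y_true, y_pred, threshold=0.5):
    """Fraction of samples whose whole label set is predicted correctly."""
    hits = [
        binarize(t_row, threshold) == binarize(p_row, threshold)
        for t_row, p_row in zip(y_true, y_pred)
    ]
    return sum(hits) / len(hits)


# Made-up example: 2 samples, 5 genres.
y_true = [[0, 1, 0, 1, 0], [1, 0, 0, 0, 0]]
y_pred = [[0.1, 0.9, 0.2, 0.3, 0.1], [0.8, 0.1, 0.2, 0.1, 0.3]]

print(label_wise_accuracy(y_true, y_pred))  # 0.9
print(exact_match(y_true, y_pred))          # 0.5
```

The first sample misses one of its two genres, so it scores 4/5 label-wise but 0 under exact match. In TensorFlow, a strict version would presumably replace the mean over the label axis with `tf.reduce_all` followed by a cast to float.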

Problem with this model

During training, subset_accuracy and hamming_loss get stuck at some point, which looks like the following:

[Figure: training history]

What could cause this behaviour? I am honestly a bit lost right now. Could this be a case of the dying ReLU problem? Or am I using the metrics incorrectly, or is their implementation wrong?

So far I have tried different optimizers and lowering the learning rate (e.g. from 0.01 to 0.001, 0.0001, etc.), but that didn't help either.
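One sanity check that may help tell a stuck model apart from a metric problem is to compare the plateau values with what a constant predictor would score. A minimal sketch in plain Python, using made-up labels and a hypothetical helper `hamming_loss_constant_zero`:

```python
def hamming_loss_constant_zero(y_true):
    """Hamming loss of a predictor that never outputs any label.

    Every positive label counts as one error, so the result equals the
    fraction of label slots that are 1 in the ground truth.
    """
    total_slots = sum(len(row) for row in y_true)
    positives = sum(sum(row) for row in y_true)
    return positives / total_slots


# Made-up ground truth: 4 covers, 5 genres each.
y_true = [
    [0, 1, 0, 1, 0],
    [1, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 1, 0, 0, 0],
]

print(hamming_loss_constant_zero(y_true))  # 0.25
```

If the Hamming loss reported during training sits at about this value computed on the real labels, the model is likely producing no score above the 0.5 threshold for almost any image, which would point at the training setup rather than at the metric code.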

Maybe somebody has an idea that can help me. Thanks in advance!

Phil

1 Answer


I think you need to tune your model's hyperparameters properly. For that I'd recommend trying the Keras Tuner library. It takes some time to run, but it should find you a good set of hyperparameters.

Dharman
MiKee
Thanks, I'll look into it. I suppose you are talking about this: https://www.tensorflow.org/tutorials/keras/keras_tuner? – Phil Sep 26 '21 at 18:33