
QUESTION: Why does my custom loss function require the run_eagerly parameter to be set to True in the compile method in TensorFlow?

BACKGROUND: I have built a CNN with TensorFlow 2.8.1 to classify CIFAR-100 images using a custom loss function. The CIFAR-100 dataset includes 32x32-pixel RGB images of 100 fine classes (e.g., bear, car) categorized into 20 coarse classes (e.g., large omnivore, vehicle). My custom loss function is a weighted sum of two other loss functions (see code below). The first component is the crossentropy loss for the fine label. The second component is the crossentropy loss for the coarse label. My hope is that this custom loss function will enforce accurate classification of the coarse label and thereby yield more accurate classification of the fine label (fingers crossed). The comparator will be the crossentropy loss of just the fine label (the baseline model). Note that to derive the coarse (hierarchical) loss component, I had to map y_true (the true fine label, an integer) and y_pred (the predicted softmax probabilities over the fine labels, a vector) to y_true_coarse_int (the true coarse label, an integer) and y_pred_coarse_hot (the predicted coarse label, a one-hot encoded vector), respectively. FineInts_to_CoarseInts is a Python dictionary that performs this mapping (for example, fine label 3, "bear", maps to coarse label 8, "large carnivores").

# THIS CODE CELL IS TO DEFINE A CUSTOM LOSS FUNCTION

def crossentropy_loss(y_true, y_pred):
    return SparseCategoricalCrossentropy()(y_true, y_pred)

def hierarchical_loss(y_true, y_pred):
    y_true = tensorflow.cast(y_true, dtype=float)
    y_true_reshaped = tensorflow.reshape(y_true,  -1)
    y_true_coarse_int = [FineInts_to_CoarseInts[K.eval(y_true_reshaped[i])] for i in range(y_true_reshaped.shape[0])]
    y_true_coarse_int = tensorflow.cast(y_true_coarse_int, dtype=tensorflow.float32)

    y_pred = tensorflow.cast(y_pred, dtype=float)
    y_pred_int = tensorflow.argmax(y_pred, axis=1)
    y_pred_coarse_int = [FineInts_to_CoarseInts[K.eval(y_pred_int[i])] for i in range(y_pred_int.shape[0])]
    y_pred_coarse_int = tensorflow.cast(y_pred_coarse_int, dtype=tensorflow.float32)
    y_pred_coarse_hot = to_categorical(y_pred_coarse_int, 20)

    return SparseCategoricalCrossentropy()(y_true_coarse_int, y_pred_coarse_hot)

def custom_loss(y_true, y_pred):
    H = 0.5
    total_loss = (1 - H) * crossentropy_loss(y_true, y_pred) + H * hierarchical_loss(y_true, y_pred)
    return total_loss

In the loss function I am using only TensorFlow tensor operations and Keras backend operations, so I do not understand why I have to compile the model with run_eagerly=True (see below). The one potential problem I see is the use of K.eval, but I have seen other code that used it without needing to compile with eager execution, so I am assuming that this can't be the reason.

# THIS CODE CELL IS TO COMPILE THE MODEL

model.compile(optimizer="adam", loss=custom_loss, metrics="accuracy", run_eagerly=True)
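
For what it's worth, here is how I think the fine-to-coarse mapping could be written with graph operations instead of K.eval and Python list comprehensions. This is an untested sketch that mirrors the argmax/one-hot logic above; fine_to_coarse_table and hierarchical_loss_graph are names I made up, and it uses the StaticHashTable and KeyValueTensorInitializer imports from the full code below:

# UNTESTED SKETCH: do the fine-to-coarse mapping with a lookup table so it stays in the graph
keys = tensorflow.constant(list(FineInts_to_CoarseInts.keys()), dtype=tensorflow.int64)
values = tensorflow.constant(list(FineInts_to_CoarseInts.values()), dtype=tensorflow.int64)
fine_to_coarse_table = StaticHashTable(KeyValueTensorInitializer(keys, values), default_value=-1)

def hierarchical_loss_graph(y_true, y_pred):
    # Map the true fine labels (ints) to the true coarse labels (ints) symbolically
    y_true_fine = tensorflow.cast(tensorflow.reshape(y_true, [-1]), tensorflow.int64)
    y_true_coarse = fine_to_coarse_table.lookup(y_true_fine)

    # Map the predicted fine class to its coarse class and one-hot encode it (mirrors the original logic)
    y_pred_fine = tensorflow.argmax(y_pred, axis=1)
    y_pred_coarse_hot = tensorflow.one_hot(fine_to_coarse_table.lookup(y_pred_fine), 20)

    return SparseCategoricalCrossentropy()(y_true_coarse, y_pred_coarse_hot)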

The full code is below:

# THIS CODE CELL LOADS THE PACKAGES USED IN THIS NOTEBOOK

# Load core packages for data analysis and visualization
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sn
import sys

!{sys.executable} -m pip install pydot
!{sys.executable} -m pip install graphviz

# Load deep learning packages
import tensorflow                                                                           
from tensorflow.keras.datasets.cifar100 import load_data      
from tensorflow.keras import (Model, layers)          
from tensorflow.keras.losses import SparseCategoricalCrossentropy
import tensorflow.keras.backend as K
from tensorflow.keras.utils import (to_categorical, plot_model)
from tensorflow.lookup import (StaticHashTable, KeyValueTensorInitializer)

# Load model evaluation packages
import sklearn
from sklearn.metrics import (confusion_matrix, classification_report)

# Print versions of main ML packages
print("Tensorflow version " + tensorflow.__version__)
print("Scikit learn version " + sklearn.__version__)

# THIS CODE CELL LOADS DATASETS AND CHECKS DATA DIMENSIONS

# There is an option to load the "fine" (100 fine classes) or "coarse" (20 super classes) labels with integer (int) encodings
# We will load both labels for hierarchical classification tasks
(x_train, y_train_fine_int), (x_test, y_test_fine_int) = load_data(label_mode="fine")
(_, y_train_coarse_int), (_, y_test_coarse_int) = load_data(label_mode="coarse")

# EXTRACT DATASET PARAMETERS FOR USE LATER ON
num_fine_classes = 100
num_coarse_classes = 20
input_shape = x_train.shape[1:]  

# THE NEXT SEVERAL CODE CELLS LINK THE INTEGER LABELS TO MEANINGFUL WORD LABELS
# Fine and coarse labels are provided as integers. We will want to link them both to meaningful word labels.

# CREATE A DICTIONARY TO MAP THE 20 COARSE LABELS TO THE 100 FINE LABELS

# This mapping comes from https://keras.io/api/datasets/cifar100/ 
# Except "computer keyboard" should just be "keyboard" for the encoding to work
CoarseLabels_to_FineLabels = {
    "aquatic mammals":                  ["beaver", "dolphin", "otter", "seal", "whale"],
    "fish":                             ["aquarium fish", "flatfish", "ray", "shark", "trout"],
    "flowers":                          ["orchids", "poppies", "roses", "sunflowers", "tulips"],
    "food containers":                  ["bottles", "bowls", "cans", "cups", "plates"],
    "fruit and vegetables":             ["apples", "mushrooms", "oranges", "pears", "sweet peppers"],
    "household electrical devices":     ["clock", "keyboard", "lamp", "telephone", "television"],
    "household furniture":              ["bed", "chair", "couch", "table", "wardrobe"],
    "insects":                          ["bee", "beetle", "butterfly", "caterpillar", "cockroach"],
    "large carnivores":                 ["bear", "leopard", "lion", "tiger", "wolf"],
    "large man-made outdoor things":    ["bridge", "castle", "house", "road", "skyscraper"],
    "large natural outdoor scenes":     ["cloud", "forest", "mountain", "plain", "sea"],
    "large omnivores and herbivores":   ["camel", "cattle", "chimpanzee", "elephant", "kangaroo"],
    "medium-sized mammals":             ["fox", "porcupine", "possum", "raccoon", "skunk"],
    "non-insect invertebrates":         ["crab", "lobster", "snail", "spider", "worm"],
    "people":                           ["baby", "boy", "girl", "man", "woman"],
    "reptiles":                         ["crocodile", "dinosaur", "lizard", "snake", "turtle"],
    "small mammals":                    ["hamster", "mouse", "rabbit", "shrew", "squirrel"],
    "trees":                            ["maple", "oak", "palm", "pine", "willow"],
    "vehicles 1":                       ["bicycle", "bus", "motorcycle", "pickup" "truck", "train"],
    "vehicles 2":                       ["lawn-mower", "rocket", "streetcar", "tank", "tractor"]
}

# CREATE A DICTIONARY TO MAP THE INTEGER-ENCODED COARSE LABEL TO THE WORD LABEL
# Create a list of the Coarse Labels
CoarseLabels = list(CoarseLabels_to_FineLabels.keys())

# The target variable in CIFAR-100 is encoded such that the coarse class is assigned an integer based on its alphabetical order
# The CoarseLabels list is already alphabetized, so no need to sort
CoarseInts_to_CoarseLabels = dict(enumerate(CoarseLabels))

# CREATE A DICTIONARY TO MAP THE WORD LABEL TO THE INTEGER-ENCODED COARSE LABEL
CoarseLabels_to_CoarseInts = dict(zip(CoarseLabels, range(20)))


# CREATE A DICTIONARY TO MAP THE 100 FINE LABELS TO THE 20 COARSE LABELS
FineLabels_to_CoarseLabels = {}
for CoarseLabel in CoarseLabels:
    for FineLabel in CoarseLabels_to_FineLabels[CoarseLabel]:
        FineLabels_to_CoarseLabels[FineLabel] = CoarseLabel
        
# CREATE A DICTIONARY TO MAP THE INTEGER-ENCODED FINE LABEL TO THE WORD LABEL
# Create a list of the Fine Labels
FineLabels = list(FineLabels_to_CoarseLabels.keys())

# The target variable in CIFAR-100 is encoded such that the fine class is assigned an integer based on its alphabetical order
# Sort the fine class list.  
FineLabels.sort()
FineInts_to_FineLabels = dict(enumerate(FineLabels))


# CREATE A DICTIONARY TO MAP THE INTEGER-ENCODED FINE LABELS TO THE INTEGER-ENCODED COARSE LABELS
# Sort the fine labels alphabetically (matching their integer encoding) and list the coarse label of each
CoarseLabels_in_FineInt_order = [FineLabels_to_CoarseLabels[FineLabel] for FineLabel in sorted(FineLabels)]
FineInts_to_CoarseInts = dict(zip(range(100), [CoarseLabels_to_CoarseInts[CoarseLabel] for CoarseLabel in CoarseLabels_in_FineInt_order]))

# Tensor version of the dictionary (currently unused)
#fine_to_coarse = tensorflow.constant(list((FineInts_to_CoarseInts).items()), dtype=tensorflow.int8)
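
A quick sanity check that the derived mapping agrees with the coarse labels loaded earlier (my own check; it should pass if the dictionaries above are built correctly):

# SANITY CHECK: every fine label in the training set should map to the coarse
# label that load_data(label_mode="coarse") provided for the same image
derived_coarse = np.array([FineInts_to_CoarseInts[fine] for fine in y_train_fine_int.flatten()])
assert (derived_coarse == y_train_coarse_int.flatten()).all()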




# THIS CODE CELL IS TO BUILD A FUNCTIONAL MODEL

inputs = layers.Input(shape=input_shape)
x = layers.BatchNormalization()(inputs)

x = layers.Conv2D(64, (3, 3), padding='same', activation="relu")(x)
x = layers.MaxPooling2D(pool_size=(2, 2))(x)
x = layers.Dropout(0.30)(x)

x = layers.Conv2D(256, (3, 3), padding='same', activation="relu")(x)
x = layers.MaxPooling2D(pool_size=(2, 2))(x)
x = layers.Dropout(0.30)(x)

x = layers.Conv2D(256, (3, 3), padding='same', activation="relu")(x)
x = layers.MaxPooling2D(pool_size=(2, 2))(x)
x = layers.Dropout(0.30)(x)

x = layers.Conv2D(1024, (3, 3), padding='same', activation="relu")(x)
x = layers.MaxPooling2D(pool_size=(2, 2))(x)
x = layers.Dropout(0.30)(x)

x = layers.GlobalAveragePooling2D()(x)
x = layers.BatchNormalization()(x)
x = layers.Dropout(0.30)(x)

x = layers.Dense(512, activation = "relu")(x)
x = layers.BatchNormalization()(x)
x = layers.Dropout(0.30)(x)

output_fine = layers.Dense(num_fine_classes, activation="softmax", name="output_fine")(x)

model = Model(inputs=inputs, outputs=output_fine)



# THIS CODE CELL IS TO DEFINE A CUSTOM LOSS FUNCTION

def crossentropy_loss(y_true, y_pred):
    return SparseCategoricalCrossentropy()(y_true, y_pred)

def hierarchical_loss(y_true, y_pred):
    y_true = tensorflow.cast(y_true, dtype=float)
    y_true_reshaped = tensorflow.reshape(y_true,  -1)
    y_true_coarse_int = [FineInts_to_CoarseInts[K.eval(y_true_reshaped[i])] for i in range(y_true_reshaped.shape[0])]
    y_true_coarse_int = tensorflow.cast(y_true_coarse_int, dtype=tensorflow.float32)

    y_pred = tensorflow.cast(y_pred, dtype=float)
    y_pred_int = tensorflow.argmax(y_pred, axis=1)
    y_pred_coarse_int = [FineInts_to_CoarseInts[K.eval(y_pred_int[i])] for i in range(y_pred_int.shape[0])]
    y_pred_coarse_int = tensorflow.cast(y_pred_coarse_int, dtype=tensorflow.float32)
    y_pred_coarse_hot = to_categorical(y_pred_coarse_int, 20)

    return SparseCategoricalCrossentropy()(y_true_coarse_int, y_pred_coarse_hot)

def custom_loss(y_true, y_pred):
    H = 0.5
    total_loss = (1 - H) * crossentropy_loss(y_true, y_pred) + H * hierarchical_loss(y_true, y_pred)
    return total_loss




# THIS CODE CELL IS TO COMPILE THE MODEL

model.compile(optimizer="adam", loss=crossentropy_loss, metrics="accuracy", run_eagerly=False)
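# NOTE: this baseline (crossentropy_loss only) compiles and trains with run_eagerly=False;
# swapping in loss=custom_loss is what forces run_eagerly=True, as described at the top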


# THIS CODE CELL IS TO TRAIN THE MODEL

history = model.fit(x_train, y_train_fine_int, epochs=200, validation_split=0.25, batch_size=100)


# THIS CODE CELL IS TO VISUALIZE THE TRAINING

history_frame = pd.DataFrame(history.history)
history_frame.to_csv("history.csv")
history_frame.loc[:, ["accuracy", "val_accuracy"]].plot()
history_frame.loc[:, ["loss", "val_loss"]].plot()
plt.show()


# THIS CODE CELL IS TO EVALUATE THE MODEL ON AN INDEPENDENT DATASET

score = model.evaluate(x_test, y_test_fine_int, verbose=0)
print("Test loss:", score[0])
print("Test accuracy:", score[1])
