During training, the sparse_categorical_crossentropy loss produces a finite value, but on the validation set the same loss comes back as nan.
It is a multi-output model. One output uses binary_crossentropy (which gives a finite loss on both sets) and the other uses sparse_categorical_crossentropy (which gives nan on the validation set).
Here is the relevant code.
This is how the data looks:
     id  boneage  male  boneage_cat
0  1377      180     0            3
1  1378       12     0            0
2  1379       94     0            2
3  1380      120     1            2
4  1381       82     0            1
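Because the sparse loss uses boneage_cat as a class index, one check that may be worth running is whether every label in both dataframes is a whole number in the range 0 to CAT_CLASSES - 1. The following is a minimal sketch of such a check, not code from the actual project: the helper name find_invalid_labels and the sample values are hypothetical, and the labels are assumed to be passed in as a plain list.

```python
import math


def find_invalid_labels(labels, num_classes):
    """Return (position, value) pairs for labels that are not valid class ids."""
    invalid = []
    for pos, value in enumerate(labels):
        # NaN (for example from an empty cell) or a fractional value is invalid.
        if isinstance(value, float) and (math.isnan(value) or not value.is_integer()):
            invalid.append((pos, value))
        # Valid class ids are 0 .. num_classes - 1.
        elif not 0 <= value < num_classes:
            invalid.append((pos, value))
    return invalid


# Hypothetical values, not taken from the real dataset.
train_labels = [3, 0, 2, 2, 1]
val_labels = [3, 0, 4, float("nan"), 1]

print(find_invalid_labels(train_labels, num_classes=4))
print(find_invalid_labels(val_labels, num_classes=4))
```

With the dataframes from this question, the call would look something like find_invalid_labels(test_df["boneage_cat"].tolist(), CAT_CLASSES), run once for train_df and once for test_df.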
Data Generator
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Augment only the training images; the test images are just rescaled.
train_datagen = ImageDataGenerator(rescale=1./255,
                                   rotation_range=15,
                                   width_shift_range=0.2,
                                   height_shift_range=0.2,
                                   shear_range=0.2,
                                   zoom_range=0.2,
                                   horizontal_flip=True,
                                   fill_mode='nearest')
test_datagen = ImageDataGenerator(rescale=1./255)

# Both generators yield one label array per output, in this order.
target_column = ["male", "boneage_cat"]
train_generator = train_datagen.flow_from_dataframe(dataframe=train_df,
                                                    directory=f"{BASE_DIR}{FILE_PREFIX}/",
                                                    x_col="id",
                                                    y_col=target_column,
                                                    batch_size=BATCH_SIZE,
                                                    seed=RANDOM_STATE,
                                                    class_mode="multi_output",
                                                    shuffle=True,
                                                    color_mode=COLOR_SPACE,
                                                    target_size=(IMG_SIZE, IMG_SIZE))
test_generator = test_datagen.flow_from_dataframe(dataframe=test_df,
                                                  directory=f"{BASE_DIR}{FILE_PREFIX}/",
                                                  x_col="id",
                                                  y_col=target_column,
                                                  batch_size=BATCH_SIZE,
                                                  seed=RANDOM_STATE,
                                                  class_mode="multi_output",
                                                  shuffle=False,
                                                  color_mode=COLOR_SPACE,
                                                  target_size=(IMG_SIZE, IMG_SIZE))
training_steps_per_epoch = train_generator.n // BATCH_SIZE
test_steps_per_epoch = test_generator.n // BATCH_SIZE
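As a side note on the two lines above: floor division drops the last partial batch, so some validation rows are never evaluated whenever test_generator.n is not a multiple of BATCH_SIZE. This is probably unrelated to the nan, but the difference is easy to see in a small sketch. The helper name steps_per_epoch and the numbers are hypothetical.

```python
import math


def steps_per_epoch(num_samples, batch_size, drop_remainder=True):
    """Number of batches per epoch, with or without the final partial batch."""
    if drop_remainder:
        # Same as `n // BATCH_SIZE`: the leftover samples are skipped.
        return num_samples // batch_size
    # Round up so the final, smaller batch is included.
    return math.ceil(num_samples / batch_size)


# Hypothetical sizes: 1000 samples, batches of 32.
print(steps_per_epoch(1000, 32))                        # 31
print(steps_per_epoch(1000, 32, drop_remainder=False))  # 32
```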
The layers above these come from InceptionV3 (base); only the two output heads are shown:
gender_output = Dense(1, activation="sigmoid", name="gender_output")(dense)
boneage_cat_output = Dense(CAT_CLASSES, activation="softmax", name="boneage_cat_output")(dense)
model = Model(inputs=base.input, outputs=[gender_output, boneage_cat_output])
return model
Optimiser and losses (the loss list follows the order of the model outputs)
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=LR),
              metrics=["accuracy"],
              loss=["binary_crossentropy", "sparse_categorical_crossentropy"])
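For reference, the sparse categorical cross-entropy of one sample is -log(p[y]), where p is the softmax output and y is the integer label, so y has to be a valid index into p. The TensorFlow documentation for tf.nn.sparse_softmax_cross_entropy_with_logits states that labels outside [0, num_classes) raise an exception on CPU and return NaN for the affected rows on GPU. The pure-Python sketch below only imitates that GPU behaviour to show how a single bad label can turn the averaged loss into nan while accuracy stays finite. It is not Keras code; the function names and numbers are hypothetical, and labels are assumed to be plain Python ints.

```python
import math


def sparse_cce(probs, label):
    """Per-sample loss -log(probs[label]); NaN if label is not a valid class index."""
    if not (isinstance(label, int) and 0 <= label < len(probs)):
        # Assumption: imitate the NaN-on-GPU behaviour described above.
        return float("nan")
    return -math.log(probs[label])


def mean_loss(batch_probs, labels):
    """Average the per-sample losses; one NaN makes the whole mean NaN."""
    losses = [sparse_cce(p, y) for p, y in zip(batch_probs, labels)]
    return sum(losses) / len(losses)


def accuracy(batch_probs, labels):
    """Fraction of samples whose arg-max class equals the label."""
    hits = 0
    for p, y in zip(batch_probs, labels):
        predicted = max(range(len(p)), key=p.__getitem__)
        hits += int(predicted == y)
    return hits / len(labels)


# Hypothetical softmax outputs for two samples and four classes.
batch = [[0.7, 0.1, 0.1, 0.1], [0.25, 0.25, 0.25, 0.25]]

print(mean_loss(batch, [0, 3]))  # finite: both labels are valid
print(mean_loss(batch, [0, 4]))  # nan: label 4 is out of range for 4 classes
print(accuracy(batch, [0, 4]))   # still a finite number
```

This is the same pattern as a nan validation loss next to a finite validation accuracy, although it is only one possible explanation.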
This is how the output looks
Epoch 1/50
249/249 [==============================] - ETA: 0s - loss: 2.2914 - gender_output_loss: 0.7802 - boneage_cat_output_loss: 1.5112 - gender_output_accuracy: 0.5157 - boneage_cat_output_accuracy: 0.3680
Epoch 1: val_loss did not improve from inf
249/249 [==============================] - 108s 287ms/step - loss: 2.2914 - gender_output_loss: 0.7802 - boneage_cat_output_loss: 1.5112 - gender_output_accuracy: 0.5157 - boneage_cat_output_accuracy: 0.3680 - val_loss: nan - val_gender_output_loss: 0.6941 - val_boneage_cat_output_loss: nan - val_gender_output_accuracy: 0.4632 - val_boneage_cat_output_accuracy: 0.4194
Epoch 2/50
249/249 [==============================] - ETA: 0s - loss: 2.0244 - gender_output_loss: 0.6930 - boneage_cat_output_loss: 1.3314 - gender_output_accuracy: 0.5356 - boneage_cat_output_accuracy: 0.4191
Epoch 2: val_loss did not improve from inf
249/249 [==============================] - 68s 273ms/step - loss: 2.0244 - gender_output_loss: 0.6930 - boneage_cat_output_loss: 1.3314 - gender_output_accuracy: 0.5356 - boneage_cat_output_accuracy: 0.4191 - val_loss: nan - val_gender_output_loss: 0.6904 - val_boneage_cat_output_loss: nan - val_gender_output_accuracy: 0.5368 - val_boneage_cat_output_accuracy: 0.4194
Epoch 3/50
249/249 [==============================] - ETA: 0s - loss: 2.0020 - gender_output_loss: 0.6904 - boneage_cat_output_loss: 1.3116 - gender_output_accuracy: 0.5396 - boneage_cat_output_accuracy: 0.4245
Epoch 3: val_loss did not improve from inf
249/249 [==============================] - 70s 280ms/step - loss: 2.0020 - gender_output_loss: 0.6904 - boneage_cat_output_loss: 1.3116 - gender_output_accuracy: 0.5396 - boneage_cat_output_accuracy: 0.4245 - val_loss: nan - val_gender_output_loss: 0.6902 - val_boneage_cat_output_loss: nan - val_gender_output_accuracy: 0.5368 - val_boneage_cat_output_accuracy: 0.4194
Epoch 4/50
249/249 [==============================] - ETA: 0s - loss: 1.9959 - gender_output_loss: 0.6904 - boneage_cat_output_loss: 1.3055 - gender_output_accuracy: 0.5426 - boneage_cat_output_accuracy: 0.4244
Epoch 4: val_loss did not improve from inf
249/249 [==============================] - 69s 276ms/step - loss: 1.9959 - gender_output_loss: 0.6904 - boneage_cat_output_loss: 1.3055 - gender_output_accuracy: 0.5426 - boneage_cat_output_accuracy: 0.4244 - val_loss: nan - val_gender_output_loss: 0.6907 - val_boneage_cat_output_loss: nan - val_gender_output_accuracy: 0.5368 - val_boneage_cat_output_accuracy: 0.4194
Epoch 5/50
249/249 [==============================] - ETA: 0s - loss: 1.9988 - gender_output_loss: 0.6917 - boneage_cat_output_loss: 1.3070 - gender_output_accuracy: 0.5425 - boneage_cat_output_accuracy: 0.4252
Epoch 5: val_loss did not improve from inf
249/249 [==============================] - 68s 271ms/step - loss: 1.9988 - gender_output_loss: 0.6917 - boneage_cat_output_loss: 1.3070 - gender_output_accuracy: 0.5425 - boneage_cat_output_accuracy: 0.4252 - val_loss: nan - val_gender_output_loss: 0.6907 - val_boneage_cat_output_loss: nan - val_gender_output_accuracy: 0.5368 - val_boneage_cat_output_accuracy: 0.4194
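I don't understand why val_boneage_cat_output_loss is nan in every epoch while the training loss for the same output and its validation accuracy are both finite.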