
I'm training a ResNet model with Keras, fine-tuned on my own images. While training, TensorBoard constantly reports a validation loss that seems unrelated to the training loss (much higher; see the image below, where the training loss is the orange line and the validation loss the blue line). Furthermore, when training is finished (for example, final losses as reported by TensorBoard could be 0.06 and 0.57 respectively), I evaluate the model "manually" and the validation loss turns out to be in the same range as the training loss (e.g. 0.07).

[TensorBoard screenshot: training loss (orange) vs. validation loss (blue)]

I suspect that preprocessing could be the reason for this strange result. Essentially, the inputs and outputs of the model are created like this:


import tensorflow as tf

inp = tf.keras.Input(input_shape)
resnet = tf.keras.applications.ResNet50V2(include_top=False, input_shape=input_shape, input_tensor=inp, pooling="avg")

# Add the ResNet50V2-specific preprocessing into the model itself.
preprocessed = tf.keras.layers.Lambda(lambda x: tf.keras.applications.resnet_v2.preprocess_input(x))(inp)
out = resnet(preprocessed)
out = tf.keras.layers.Dense(num_outputs, activation=None)(out)

model = tf.keras.Model(inputs=inp, outputs=out)

and the training:

model.compile(
    optimizer=tf.keras.optimizers.Adam(lrate),
    loss='mse',
    metrics=[tf.keras.metrics.MeanSquaredError()],
)

model.fit(
    train_dataset,
    epochs=epochs,
    validation_data=val_dataset,
    callbacks=callbacks,
)

It's as if preprocessing does not occur when the validation loss is calculated, but I don't know why.
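
For reference, this is the kind of sanity check I run after training (a minimal sketch, assuming the fit call above has its return value captured as history and that val_dataset is the same dataset TensorBoard sees): it compares the last val_loss logged by fit against a direct model.evaluate call on the validation dataset.

# The fit call above, with its return value captured (assumption: nothing else changed).
history = model.fit(train_dataset, epochs=epochs,
                    validation_data=val_dataset, callbacks=callbacks)

# Last validation loss logged by fit -- this is what TensorBoard plots.
print("val_loss from fit:     ", history.history["val_loss"][-1])

# Validation loss recomputed with evaluate on the same validation dataset.
val_loss, val_mse = model.evaluate(val_dataset, verbose=0)
print("val_loss from evaluate:", val_loss)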

Patrick
  • There are no guarantees for validation performance/loss, so your expectation is not realistic, and it does not mean that the validation loss is "wrong". – Dr. Snoopy Jul 13 '22 at 20:19
  • The TensorBoard val_loss seems to be wrong, as its final value is not the same as what I get when I do the evaluation manually after the model has been saved – Patrick Jul 13 '22 at 20:30
  • You said 0.6, which is close to 0.57. Also, what does "manually" mean exactly? Always include code. – Dr. Snoopy Jul 13 '22 at 20:32
  • Sorry for the typo (edited); it should be 0.07, not 0.6. So the final val_loss reported by TensorBoard is 0.57, but the validation loss calculated with model.evaluate on the validation dataset is 0.07 (which is close to the final training loss) – Patrick Jul 13 '22 at 20:35
  • 1
    Do these numbers match with the Keras log in the console? – Dr. Snoopy Jul 13 '22 at 22:52
  • You were right. The validation loss reported was correct; the problem was that I tried to do transfer learning without setting the trainable parameter to False (=> overfitting), see the sketch below. – Patrick Jul 14 '22 at 17:58
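
For completeness, a minimal sketch of the fix mentioned in the last comment, assuming the model is built exactly as in the snippet above (resnet and model refer to those variables): freeze the pretrained backbone before compiling, so that only the new Dense head is trained.

# Freeze the pretrained backbone so its weights are not updated during fine-tuning.
resnet.trainable = False

# Recompile after changing trainable, otherwise the change has no effect on training.
model.compile(
    optimizer=tf.keras.optimizers.Adam(lrate),
    loss='mse',
    metrics=[tf.keras.metrics.MeanSquaredError()],
)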

0 Answers