Why is the mse as a loss different from the mse as a metric in keras?

Question

I have a regression model built in keras. The loss is mse. The output during training is as follows:

4/4 [==============================] - 16s 1s/step - loss: 21.4834 - root_mean_squared_error: 4.6350 - full_mse: 23.5336 - mean_squared_error: 23.5336 - val_loss: 32.6890 - val_root_mean_squared_error: 5.7174 - val_full_mse: 32.6890 - val_mean_squared_error: 32.6890

Why is the mse as a loss different from the mse as a metric? (loss = 21.4834; mse = 23.5336; why do these values differ? They ought to be the same.)

And why is this only the case for the training set, not the validation set? (val_loss = 32.6890; val_mse = 32.6890; these values are equal, as they ought to be.)

Any ideas?

I'm not sure how your mse_loss (especially your val_loss) is calculated. But "loss" is probably an average over the training epoch (where the weights are changing), and "mse" is calculated after the epoch (without the weights changing). val_loss and val_mse are both calculated without weight updates. – Markus, Oct 10 '22 at 23:17
I believe this is the answer. I checked it by setting the number of epochs and the batch size to 1; then I get the same values. Thanks. How can I vote for this answer? I think you have to resubmit your comment as an answer. – Boris Reif, Oct 10 '22 at 23:24
To add to this: I am not sure myself how Keras/TensorFlow computes the MSE, but your idea about averaging seems plausible. In addition to the Keras MSE, I have written a DIY loss/metric function: def full_mse(y_true, y_pred): return K.mean(K.square(y_pred - y_true)). When used as a metric, it gives the same output as tf.keras.losses.MeanSquaredError() used as a metric, so I take it that this is how the MSE is computed. – Boris Reif, Oct 10 '22 at 23:32

score 1 · Accepted Answer · answered Oct 10 '22 at 23:36

I'm posting this as an answer, since it looks like it solved the problem.

The training MSE loss ("loss") is calculated as a form of average over the training epoch, during which the weights are changing. The metric MSE ("mse") is calculated after the epoch, without weight updates.

For validation, both values ("val_loss" and "val_mse") are calculated without weight updates.

Additionally, it's possible that the displayed MSE loss is something like a moving average, in which not all minibatches of the epoch are weighted equally. I don't think this is the case for the given problem, as the validation values match. This depends on the implementation.
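To make the distinction above concrete, here is a minimal pure-Python sketch. It is not Keras code and makes no claim about how Keras computes its displayed values; the toy model y = w * x, the helper run_epoch, the learning rate, and the data are all assumptions made up for illustration. It only shows that a mean of per-batch losses collected while the weights change need not equal the MSE evaluated once with the final weights, whereas with frozen weights and equal-sized batches the two agree.

```python
import random

def mse(y_true, y_pred):
    """Mean of squared errors (same formula as full_mse in the comments)."""
    return sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def run_epoch(xs, ys, w, batch_size, lr):
    """One pass over the data for the toy model y = w * x (hypothetical helper).

    Returns the updated weight and the mean of the per-batch losses.
    With lr = 0 the weight never changes, which mimics a validation pass.
    """
    batch_losses = []
    for start in range(0, len(xs), batch_size):
        xb = xs[start:start + batch_size]
        yb = ys[start:start + batch_size]
        preds = [w * x for x in xb]
        batch_losses.append(mse(yb, preds))
        # Gradient of the batch MSE with respect to w, then one SGD step.
        grad = sum(2 * (p - t) * x for x, t, p in zip(xb, yb, preds)) / len(xb)
        w -= lr * grad
    return w, sum(batch_losses) / len(batch_losses)

random.seed(0)
xs = [random.uniform(-1, 1) for _ in range(64)]
ys = [3.0 * x + random.gauss(0, 0.1) for x in xs]

# Training-like pass: the weight changes between batches.
w_after, running_loss = run_epoch(xs, ys, w=0.0, batch_size=16, lr=0.5)
end_of_epoch_mse = mse(ys, [w_after * x for x in xs])

# Validation-like pass: the weight is frozen (lr = 0).
_, frozen_running = run_epoch(xs, ys, w=w_after, batch_size=16, lr=0.0)

print("mean of batch losses while training:", running_loss)
print("MSE with the end-of-epoch weight:   ", end_of_epoch_mse)
print("mean of batch losses, frozen weight:", frozen_running)
```

In this toy setup the first two printed values differ, while the last two agree up to floating-point rounding because all batches have the same size.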

Markus

No, this is incorrect; metrics computed during training are also a moving average. – Dr. Snoopy Oct 11 '22 at 08:20
So what do you think accounts for the difference? – Boris Reif Oct 12 '22 at 12:03
I think what he wants to say is that my assumption "I don't think this is the case for the given problem" is incorrect for the default keras implementation. – Markus Oct 14 '22 at 17:07
