
I have a question about how val_loss is calculated for a multi-output model in Keras. Here is an excerpt of my code.

import numpy as np
import pandas as pd
from keras.models import Sequential
from keras.layers import InputLayer, LSTM, Dense
from keras.optimizers import Nadam
from keras.callbacks import CSVLogger

nBatchSize  = 200
nTimeSteps  = 1
nInDims     = 17
nHiddenDims = 10
nFinalDims  = 10
nOutNum     = 24
nTraLen     = 300
nMaxEP      = 20
nValLen     = 50
sHisCSV     = "history.csv"

oModel = Sequential()
oModel.add(InputLayer(batch_input_shape=(nBatchSize, nTimeSteps, nInDims)))
oModel.add(LSTM(nHiddenDims, return_sequences=True,  stateful=True))
oModel.add(LSTM(nHiddenDims, return_sequences=False, stateful=True))
oModel.add(Dense(nFinalDims, activation="relu"))
oModel.add(Dense(nOutNum,    activation="linear"))
oModel.compile(loss="mse", optimizer=Nadam())

oModel.reset_states()
oHis = oModel.fit_generator(oDataGen, steps_per_epoch=nTraLen,
                            epochs=nMaxEP, shuffle=False,
                            validation_data=oDataGen, validation_steps=nValLen,
                            callbacks=[CSVLogger(sHisCSV, append=True)])

# number of cols is nOutNum(=24), number of rows is len(oEvaGen)
oPredDF = pd.DataFrame(oPredModel.predict_generator(oEvaGen, steps=len(oEvaGen)))

# GTDF is a dataframe of Ground Truth
nRMSE   = np.sqrt(np.nanmean(np.array(np.power(oPredDF - oGTDF, 2))))

In history.csv, val_loss is recorded as 3317.36, while the RMSE calculated from the prediction result (nRMSE above) is 66.4.

By my understanding of the Keras specification, the val_loss written in history.csv is the mean MSE of the 24 outputs. Assuming that is correct, the RMSE could be computed from history.csv as sqrt(3317.36/24) = 11.76, which is quite different from the value of nRMSE (= 66.4). On the other hand, sqrt(3317.36) = 57.6 is rather close to it.

Is my understanding of Keras specification on val_loss incorrect?

shamada

1 Answer


Your first assumption is correct, but the derivation went a bit wrong after that.
The MSE is the mean of the squared errors over the model's outputs, as you can see in the Keras documentation:

mean_squared_error
keras.losses.mean_squared_error(y_true, y_pred)

and in the Keras source code:

K.mean(K.square(y_pred - y_true), axis=-1)

Thus the RMSE is the square root of this value:

K.sqrt(K.mean(K.square(y_pred - y_true), axis=-1))
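
For illustration, here is a minimal NumPy sketch (using made-up yTrue and yPred arrays standing in for your 24-dimensional targets and predictions, so the sizes are hypothetical) showing that val_loss is the per-sample MSE averaged over the validation samples, so taking its square root already gives the overall RMSE, with no further division by the number of outputs:

import numpy as np

np.random.seed(0)
nSamples, nOutNum = 1000, 24                    # hypothetical sizes
yTrue = np.random.randn(nSamples, nOutNum)
yPred = np.random.randn(nSamples, nOutNum)

# what the Keras mse loss computes per sample: mean over the last axis (the 24 outputs)
perSampleMSE = np.mean(np.square(yPred - yTrue), axis=-1)

# val_loss is this value averaged over the validation samples
valLoss = np.mean(perSampleMSE)

# overall RMSE over all samples and outputs
rmse = np.sqrt(np.mean(np.square(yPred - yTrue)))

print(np.isclose(np.sqrt(valLoss), rmse))       # True

Note that dividing valLoss by nOutNum before taking the square root would apply the per-output averaging a second time.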

What you wrote, sqrt(3317.36/24), divides by the number of outputs a second time; that would be the square root of the per-output error, i.e. an RSE, not the RMSE.

So from your actual example:
RSE can be computed as sqrt(3317.36/24) = 11.76
RMSE can be computed as sqrt(3317.36) = 57.6
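
As a quick sanity check of these numbers, here is a minimal sketch that just plugs in the val_loss of 3317.36 reported in your history.csv:

import math

val_loss = 3317.36                  # value reported in history.csv
print(math.sqrt(val_loss / 24))     # 11.76..., dividing by the 24 outputs again
print(math.sqrt(val_loss))          # 57.59..., comparable to the nRMSE of 66.4 from the predictions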

Thus the RMSE derived from the model's val_loss and your computed nRMSE are consistent and correct.

Geeocode
  • Thank you very much for your kind advice. It's all clear now. In my model the output is not a scalar but a 24-dimensional vector, so I thought it would be handled as 24 separate outputs in Keras, with rmse1, rmse2, ..., rmse24 calculated for each output y1, y2, ..., y24 and val_loss being their sum (= rmse1 + rmse2 + ... + rmse24). Thanks to your advice, I was able to see that it is treated as a single output. – shamada Dec 05 '18 at 01:54
  • @shamada You're welcome! If you found my answer correct, please accept it, and if it was useful, please upvote it. – Geeocode Dec 05 '18 at 07:31