
I am using Keras to train an LSTM model for a time series problem. My activation function is linear and the optimizer is RMSprop. However, I observe that while the training loss decreases slowly over time and fluctuates around a small value, the validation loss jumps up and down with a large variance.

Therefore, I have two questions: 1. Does the validation loss affect the training process? Will the algorithm look at the validation loss and slow down the learning rate in case it fluctuates a lot? 2. How can I make the model more stable so that it returns more stable values of validation loss?

Thanks

Thanh Quang

3 Answers

  1. Does the validation loss affect the training process?

No. The validation set is just a small sample of data that is excluded from the training process. It is run through the network at the end of each epoch to test how well training is going, so that you can check whether the model is overfitting (i.e. training loss much lower than validation loss).
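For example, in Keras the validation set is passed to fit() separately and is only ever evaluated, never trained on. A minimal sketch with placeholder random data (the shapes and layer sizes are arbitrary):

    import numpy as np
    from tensorflow import keras

    # Placeholder data: 30-step windows of a univariate series.
    X_train, y_train = np.random.rand(500, 30, 1), np.random.rand(500, 1)
    X_val, y_val = np.random.rand(100, 30, 1), np.random.rand(100, 1)

    model = keras.Sequential([
        keras.layers.LSTM(32, input_shape=(30, 1)),
        keras.layers.Dense(1, activation='linear'),
    ])
    model.compile(optimizer='rmsprop', loss='mse')

    # X_val/y_val are only run forward at the end of each epoch to
    # report val_loss; no weight update ever uses them.
    model.fit(X_train, y_train, epochs=10,
              validation_data=(X_val, y_val))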

  2. Fluctuation in validation loss

This is a bit tougher to answer without seeing the network or data. It could just mean that your model isn't generalizing well to unseen data: it's not seeing enough similar trends between the training data and the validation data, and each time the weights are adjusted to better suit the training data, the model becomes less accurate on the validation set. You could try turning down the learning rate, but if your training loss is decreasing slowly, the learning rate is probably fine. In this situation, you have to ask yourself a few questions. Do I have enough data? Does a true time series trend exist in my data? Have I normalized my data correctly? Is my network too large for the data I have?
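If you do want to experiment with a smaller learning rate, pass an explicit RMSprop instance instead of the string default. A sketch; 0.0005 is an arbitrary value to try (the Keras default is 0.001, and older Keras versions call the argument lr rather than learning_rate):

    from tensorflow import keras

    model = keras.Sequential([
        keras.layers.LSTM(32, input_shape=(30, 1)),
        keras.layers.Dense(1, activation='linear'),
    ])
    # Half the default learning rate of 0.001; watch whether the
    # validation loss fluctuates less from epoch to epoch.
    model.compile(optimizer=keras.optimizers.RMSprop(learning_rate=0.0005),
                  loss='mse')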

DJK

I had this issue: while the training loss was decreasing, the validation loss was not. While working with an LSTM, I found the following changes helped:

  • I simplified the model: instead of 20 layers, I opted for 8 layers.
  • Instead of scaling within the range (-1, 1), I chose (0, 1); this alone reduced my validation loss by an order of magnitude (see the scaling sketch after this list).
  • I reduced the batch size from 500 to 50 (just trial and error).
  • I added more features, which I thought would intuitively add some new information to the X -> y pairs.
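A sketch of the (0, 1) scaling mentioned above, using scikit-learn's MinMaxScaler on placeholder arrays; fitting on the training split only keeps validation statistics from leaking into the scaler:

    import numpy as np
    from sklearn.preprocessing import MinMaxScaler

    X_train = np.random.rand(500, 5) * 100  # placeholder features
    X_val = np.random.rand(100, 5) * 100

    # feature_range=(0, 1) is the default; fit on the training
    # split only, then reuse the same scaler for validation.
    scaler = MinMaxScaler(feature_range=(0, 1))
    X_train_scaled = scaler.fit_transform(X_train)
    X_val_scaled = scaler.transform(X_val)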
yogender

Possible reasons:

  1. Your validation set is very small compared to your training set, which usually happens. A small change of weights makes the validation loss fluctuate much more than the training loss. This does not necessarily mean that your model is overfitting, as long as the overall trend of the validation loss keeps decreasing.

  2. Maybe your training and validation data are from different sources, so they may have different distributions. This can happen when your data is a time series and you split your train/validation data at a specific timestamp.
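A sketch of such a timestamp-based split on a placeholder array; comparing simple statistics of the two parts is a quick way to spot a distribution shift:

    import numpy as np

    data = np.random.rand(1000, 5)  # placeholder, time-ordered rows

    # Chronological split: everything before the cutoff trains,
    # everything after validates. If the series drifts over time,
    # the two parts end up with different distributions.
    cutoff = int(len(data) * 0.8)
    train, val = data[:cutoff], data[cutoff:]

    # Quick check for a shift between the splits.
    print(train.mean(axis=0))
    print(val.mean(axis=0))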

Does the validation loss affect the training process?

No. Validation (a single forward pass) and training (forward and backward passes) are different processes, so a forward pass on its own does not change how you train next.

Will the algorithm look at the validation loss and slow down the learning rate in case it fluctuates a lot?

No, but you can implement your own method to do so. However, one thing should be noted: the model is trying to learn the best solution to your cost function, which is fed by training data only, so changing the learning rate based on the validation loss doesn't make too much sense.
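For what it's worth, Keras does ship a callback that behaves like this out of the box: ReduceLROnPlateau lowers the learning rate when a monitored quantity stops improving. A self-contained sketch with placeholder data; the factor and patience values are arbitrary examples:

    import numpy as np
    from tensorflow import keras

    X_train, y_train = np.random.rand(500, 30, 1), np.random.rand(500, 1)
    X_val, y_val = np.random.rand(100, 30, 1), np.random.rand(100, 1)

    model = keras.Sequential([keras.layers.LSTM(32, input_shape=(30, 1)),
                              keras.layers.Dense(1, activation='linear')])
    model.compile(optimizer='rmsprop', loss='mse')

    # Halve the learning rate whenever val_loss has not improved
    # for 5 consecutive epochs.
    reduce_lr = keras.callbacks.ReduceLROnPlateau(
        monitor='val_loss', factor=0.5, patience=5, min_lr=1e-6)

    model.fit(X_train, y_train, epochs=50,
              validation_data=(X_val, y_val), callbacks=[reduce_lr])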

How can I make the model more stable so that it returns more stable values of validation loss?

The reasons are explained above. If it is the first case, enlarging the validation set will make your loss look more stable, but it does NOT mean the model fits better. My suggestion is: as long as you are sure your model does not overfit (the gap between training loss and validation loss is not too large), you can just save the model that gives the lowest validation loss.
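Saving the model with the lowest validation loss can be done with the ModelCheckpoint callback. Continuing the sketch above (the file name is a placeholder):

    from tensorflow import keras

    # Keep only the weights from the epoch with the lowest val_loss.
    checkpoint = keras.callbacks.ModelCheckpoint(
        'best_model.h5', monitor='val_loss', save_best_only=True)

    model.fit(X_train, y_train, epochs=50,
              validation_data=(X_val, y_val), callbacks=[checkpoint])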

If it's the second case, it can be complicated depending on your situation. You could try to exclude samples in the training set that are not "similar" to your validation set, or enlarge your model's capacity if you have enough data. Or perhaps add more metrics to monitor how well the training is going.

Xer