SGD optimiser graph

Question

I just wanted to ask a quick question. I understand that val_loss and train_loss alone are insufficient to tell whether the model is overfitting. However, I wish to use them as a rough gauge by monitoring whether val_loss is increasing. As I use the SGD optimiser, I seem to get two different trends depending on the smoothing value. Which should I use? Blue is val_loss and orange is train_loss.

With smoothing = 0.999, both seem to be decreasing, but with smoothing = 0.927, val_loss seems to be increasing. Thank you for reading!

Also, when is a good time to decrease the learning rate? Is it directly before the model overfits?

Smoothing = 0.999

Smoothing = 0.927
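For context on why the two plots can disagree, here is a minimal sketch that assumes the smoothing slider applies an exponential moving average. The thread does not name the plotting tool, and real dashboards may differ in how they seed or bias-correct the average. The loss series and the `ema_smooth` helper are made up for illustration.

```python
def ema_smooth(values, weight):
    """Exponential moving average: s_t = weight * s_(t-1) + (1 - weight) * x_t."""
    smoothed = []
    last = values[0]  # seed with the first point (an assumption; tools differ here)
    for x in values:
        last = weight * last + (1.0 - weight) * x
        smoothed.append(last)
    return smoothed


# Hypothetical validation loss: falls for 300 steps, then creeps back up for 200.
val_loss = [1.0 - 0.002 * t for t in range(300)] + [0.4 + 0.001 * t for t in range(200)]

heavy = ema_smooth(val_loss, 0.999)  # very long memory
light = ema_smooth(val_loss, 0.927)  # much shorter memory

# Compare the last point with the point 100 steps earlier on each curve.
print("0.999 trend:", "up" if heavy[-1] > heavy[-101] else "down")
print("0.927 trend:", "up" if light[-1] > light[-101] else "down")
```

On this synthetic series the 0.999 curve still reads as falling while the 0.927 curve reads as rising, because a weight of 0.999 averages over roughly the last thousand points and is still dominated by the early, high losses. Under that assumption, the lighter smoothing is the one that reflects recent behaviour.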

score 0 · Answer 1 · answered Jan 08 '20 at 18:10

In my experience with DL as applied to CNNs, overfitting is tied more to the difference between train and val accuracies/losses than to either one alone. In your graphs, it's clear that the difference in loss is increasing as time goes on, showing that your model does not generalize well beyond the training data, and hence shows signs of overfitting. It would also help for you to track classification accuracy on the train and val datasets if possible--the gap between the two shows the generalization error, which acts as a similar metric but might show more visible effects.
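As a rough illustration of tracking that difference, the sketch below computes the per-epoch gap (val loss minus train loss) and checks whether it is growing. The function names, the window size, and the loss values are all hypothetical; they are not taken from the asker's run.

```python
def generalization_gap(train_losses, val_losses):
    """Per-epoch gap: validation loss minus training loss."""
    return [v - t for t, v in zip(train_losses, val_losses)]


def gap_is_widening(gaps, window=3):
    """Compare the mean gap over the last `window` epochs with the window before it."""
    if len(gaps) < 2 * window:
        return False  # not enough history to judge
    recent = sum(gaps[-window:]) / window
    earlier = sum(gaps[-2 * window:-window]) / window
    return recent > earlier


# Hypothetical per-epoch losses, made up for illustration.
train = [0.90, 0.70, 0.55, 0.45, 0.38, 0.32]
val = [0.95, 0.80, 0.72, 0.68, 0.67, 0.67]

gaps = generalization_gap(train, val)
print([round(g, 2) for g in gaps])
print(gap_is_widening(gaps))
```

Here val loss has flattened rather than risen, yet the gap keeps growing because train loss is still falling, which is the pattern described above.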

Dropping the learning rate once the loss starts to even out and overfitting begins is a good idea; however, you may get better gains in generalization if you first adjust the net's complexity to better fit the dataset. For overfitting like this, a modest decrease in complexity may help--use the difference in train/val losses and accuracies to confirm.
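One way to make "once the loss starts to even out" concrete is a reduce-on-plateau rule: cut the learning rate after val loss has failed to improve for a few epochs. The class below is a plain-Python sketch with hypothetical names and defaults (`PlateauLRSchedule`, `factor`, `patience`, `min_delta`). Frameworks ship comparable schedulers (for example `ReduceLROnPlateau` in PyTorch and Keras), but the question does not say which framework is in use.

```python
class PlateauLRSchedule:
    """Cut the learning rate when validation loss stops improving."""

    def __init__(self, lr, factor=0.1, patience=3, min_delta=1e-4):
        self.lr = lr
        self.factor = factor        # multiply lr by this on each cut
        self.patience = patience    # epochs without improvement before a cut
        self.min_delta = min_delta  # smallest drop that counts as improvement
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Record one epoch's validation loss and return the (possibly reduced) lr."""
        if val_loss < self.best - self.min_delta:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
            if self.bad_epochs >= self.patience:
                self.lr *= self.factor
                self.bad_epochs = 0
        return self.lr


# Hypothetical validation losses, one per epoch.
schedule = PlateauLRSchedule(lr=0.1, factor=0.5, patience=2)
history = [0.80, 0.70, 0.65, 0.65, 0.66, 0.64, 0.64, 0.64]
lrs = [schedule.step(v) for v in history]
print(lrs)
```

With these made-up numbers the rate is halved at the fifth epoch and again at the eighth, each time after two epochs without improvement.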

Thank you for your explanation! I am pretty new to this. Could you kindly explain how I would go about reducing the complexity? -- Brandon Speedster Loo, Jan 08 '20 at 19:00
@BrandonSpeedsterLoo Remove some layers or channels. It depends on your net's structure. -- nanofarad, Jan 08 '20 at 19:04
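To give a feel for how much removing channels shrinks a net, the sketch below counts the weights in a stack of 2D convolutions using the standard formula (kernel height x kernel width x input channels x output channels, plus one bias per output channel). The three-layer layout is hypothetical, since the asker's architecture is not shown.

```python
def conv2d_params(in_channels, out_channels, kernel_size=3, bias=True):
    """Number of learnable parameters in one standard 2D convolution layer."""
    weights = kernel_size * kernel_size * in_channels * out_channels
    return weights + (out_channels if bias else 0)


def total_params(layers):
    """Sum parameters over a list of (in_channels, out_channels) pairs."""
    return sum(conv2d_params(c_in, c_out) for c_in, c_out in layers)


# Hypothetical 3-layer conv stack on RGB input, and the same stack with half the channels.
original = [(3, 64), (64, 128), (128, 256)]
slimmer = [(3, 32), (32, 64), (64, 128)]

print(total_params(original))  # 370816
print(total_params(slimmer))   # 93248
```

Halving the channel widths cuts the convolutional parameter count to about a quarter in this example, so even a modest reduction in width is a sizeable drop in capacity.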
