
My validation loss decreases at a good rate for the first 50 epochs, but after that it stops decreasing for the next ten epochs. I'm using MobileNet, freezing its layers, and adding my own custom head, which is as follows:

from tensorflow.keras.layers import Dense, Dropout, GlobalAveragePooling2D

def addTopModelMobileNet(bottom_model, num_classes):
    top_model = bottom_model.output
    top_model = GlobalAveragePooling2D()(top_model)
    top_model = Dense(64, activation='relu')(top_model)
    top_model = Dropout(0.25)(top_model)
    top_model = Dense(32, activation='relu')(top_model)
    top_model = Dropout(0.10)(top_model)
    top_model = Dense(num_classes, activation='softmax')(top_model)
    return top_model

I'm using alpha 0.25, a learning rate of 0.001 decayed per epoch, and Nesterov momentum of 0.8. I'm also using an EarlyStopping callback with a patience of 10 epochs.
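For reference, these settings might look roughly like the following in Keras. This is a hedged sketch, since the question does not show the optimizer or callback code, and `restore_best_weights` is an assumption, not something stated above:

```python
from tensorflow.keras.optimizers import SGD
from tensorflow.keras.callbacks import EarlyStopping

# SGD with learning rate 0.001 and Nesterov momentum 0.8, as described above.
optimizer = SGD(learning_rate=0.001, momentum=0.8, nesterov=True)

# Stop training once validation loss has not improved for 10 epochs.
early_stop = EarlyStopping(monitor='val_loss', patience=10,
                           restore_best_weights=True)
```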

[Plots: training/validation loss and accuracy]

asked by kuro

2 Answers


I face this situation almost every time I train a deep neural network:

  • You could adjust the hyperparameters so that the weight updates become less aggressive, i.e. so they stop disturbing weights that are already close to the optimum. Among these, try decreasing the optimizer's learning rate (alpha) gradually over the epochs. Momentum also affects how the weights change.

  • You could also gradually reduce the dropout rates.
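The first suggestion above (lowering the learning rate gradually over epochs) can be sketched as a simple step-decay schedule. The drop factor and step size here are illustrative choices, not values from the answer:

```python
def step_decay(epoch, initial_lr=0.001, drop=0.5, epochs_per_drop=20):
    """Halve the learning rate every `epochs_per_drop` epochs (illustrative values)."""
    return initial_lr * (drop ** (epoch // epochs_per_drop))

# In Keras this function could be wired in with a callback:
#   tf.keras.callbacks.LearningRateScheduler(step_decay)
print(step_decay(0))   # 0.001
print(step_decay(50))  # 0.00025
```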

  • Sorry, I'm new to this; could you be more specific about how to reduce the dropout gradually? – kuro Mar 16 '21 at 07:08
  • Yes, sure: try training different instances of your neural network in parallel with different dropout values, since sometimes we end up using a larger dropout than required. Keep experimenting, that's what everyone does :) – Apurv Singh Mar 16 '21 at 07:23

This phenomenon is called overfitting; at around 70 epochs it overfits in a noticeable manner.

There are a few possible reasons for this:

  1. Data: analyze your data first. Balance imbalanced classes, and use augmentation if the variation in the data is poor.

  2. Layer tuning: try tuning the dropout hyperparameters a little more. I would also suggest adding a BatchNorm layer.

  3. Finally, try decreasing the learning rate to 0.0001 and increasing the total number of epochs. Do not use EarlyStopping at this point, and look at the training history. Sometimes the global minimum can't be reached because of some weird local minimum.
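One way points 2 and 3 could be applied to the head from the question is sketched below. The BatchNorm placement, the tiny stand-in base model, and the exact rates are illustrative assumptions, not the only reasonable choices:

```python
from tensorflow.keras.layers import (BatchNormalization, Conv2D, Dense, Dropout,
                                     GlobalAveragePooling2D, Input)
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import SGD

def add_top_model_with_bn(bottom_model, num_classes, dropout_rate=0.25):
    """The question's head, with BatchNormalization added after each Dense block."""
    x = GlobalAveragePooling2D()(bottom_model.output)
    x = Dense(64, activation='relu')(x)
    x = BatchNormalization()(x)
    x = Dropout(dropout_rate)(x)
    x = Dense(32, activation='relu')(x)
    x = BatchNormalization()(x)
    return Dense(num_classes, activation='softmax')(x)

# Tiny stand-in base (a single conv) instead of MobileNet, to keep this runnable.
inputs = Input(shape=(32, 32, 3))
base = Model(inputs, Conv2D(8, 3)(inputs))

model = Model(base.input, add_top_model_with_bn(base, num_classes=5))
# Point 3: a lower learning rate than the original 0.001.
model.compile(optimizer=SGD(learning_rate=0.0001, momentum=0.8, nesterov=True),
              loss='categorical_crossentropy')
```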

Nazmul Hasan
  • The problem is that the data comes from two different sources, but I have balanced the distribution and applied augmentation as well. I normalize the images in the image generator, so should I still use the BatchNorm layer? OK, I will decrease the LR, drop early stopping, and report back. Can you be more specific about the dropout? – kuro Mar 16 '21 at 06:34
  • I'm really sorry for the late reply. 1. Yes, please still use a batch norm layer. 2. Start with a higher dropout rate, then decrease it according to the performance of your model. – Nazmul Hasan Mar 17 '21 at 10:05
  • How do I decrease the dropout after a fixed number of epochs? I searched for a callback but couldn't find any information; can you please elaborate? – kuro Mar 17 '21 at 18:12
  • Actually, you cannot change the dropout rate during training. You can change the LR but not the model configuration. I was talking about retraining after changing the dropout. – Nazmul Hasan Mar 18 '21 at 01:10
  • Okay, thanks. I have changed the number of epochs as you said and removed the dropout right before the softmax; this has improved the performance. I'll try that as well. Thank you. – kuro Mar 18 '21 at 05:22
  • Great. Please accept this answer if it helped. – Nazmul Hasan Mar 18 '21 at 05:25
  • Hello there, if I change the dropout do I have to compile the model again? Because if I do, the training starts over from scratch. – kuro Mar 22 '21 at 02:56
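On the last comment: calling compile() by itself does not reset a model's weights, but rebuilding the model with a different dropout rate does create fresh layers. Since Dropout layers carry no trainable weights, the weight lists of two otherwise-identical models line up one-to-one and can simply be copied across. The tiny head below is a hypothetical stand-in, not the question's model:

```python
import numpy as np
from tensorflow.keras.layers import Dense, Dropout, Input
from tensorflow.keras.models import Model

def build_head(dropout_rate):
    # Hypothetical small head: only the dropout rate differs between builds.
    inputs = Input(shape=(16,))
    x = Dense(8, activation='relu')(inputs)
    x = Dropout(dropout_rate)(x)
    return Model(inputs, Dense(3, activation='softmax')(x))

old_model = build_head(0.25)
new_model = build_head(0.10)   # same architecture, different dropout rate only

# Dropout has no weights, so the weight lists match one-to-one.
new_model.set_weights(old_model.get_weights())

# Dropout is inactive at inference time, so both models now agree exactly.
x = np.random.rand(4, 16).astype('float32')
assert np.allclose(old_model.predict(x, verbose=0),
                   new_model.predict(x, verbose=0))
```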