
I've been running the EfficientNet code from Google on my own image datasets and have run into the following problem. For each variant of the architecture (b0 to b7), the training and validation loss decrease until roughly 100 epochs, after which both start to increase rapidly while the validation accuracy does the inverse.

I've not seen this pattern anywhere before. My suspicion is that it is because of overfitting, but then wouldn't the training loss continue to decrease?
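To illustrate, this is the kind of tf.keras callback setup I could use to at least capture the weights from just before the losses blow up (a minimal sketch; the filename and patience value are placeholders, not part of my actual run):

    import tensorflow as tf

    # Sketch: checkpoint the lowest-val_loss epoch and stop training once
    # val_loss starts climbing. "best_model.h5" and patience=10 are
    # placeholder values, not from the actual run.
    callbacks = [
        tf.keras.callbacks.ModelCheckpoint(
            "best_model.h5", monitor="val_loss", save_best_only=True),
        tf.keras.callbacks.EarlyStopping(
            monitor="val_loss", patience=10, restore_best_weights=True),
    ]
    # Passed to model.fit(..., callbacks=callbacks)

Even with the best epoch saved, though, I'd still like to understand why both losses climb afterwards.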

Looking at other SO questions, this one comes closest to what I mean, but I'm not sure. If this is a vanishing gradient problem, how come the folks at Google didn't experience it with ImageNet data?

Setup

This has been run using the EfficientNet tutorial. My dataset consists of 41k images for training and 5k images for validation.
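For concreteness, a rough tf.keras sketch of this kind of transfer-learning setup (the paths, class count, image size, and hyperparameters below are my placeholders, not the tutorial's exact values):

    import tensorflow as tf

    # Hypothetical paths and class count; adjust to the real dataset layout.
    NUM_CLASSES = 10
    IMG_SIZE = (224, 224)

    train_ds = tf.keras.utils.image_dataset_from_directory(
        "data/train", image_size=IMG_SIZE, batch_size=32)  # ~41k images
    val_ds = tf.keras.utils.image_dataset_from_directory(
        "data/val", image_size=IMG_SIZE, batch_size=32)    # ~5k images

    # EfficientNetB0 backbone with ImageNet weights plus a small custom head;
    # the tf.keras EfficientNet models expect raw 0-255 pixel inputs.
    base = tf.keras.applications.EfficientNetB0(
        include_top=False, weights="imagenet", input_shape=IMG_SIZE + (3,))
    base.trainable = True  # fine-tune the whole backbone

    model = tf.keras.Sequential([
        base,
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
    ])

    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )

    history = model.fit(train_ds, validation_data=val_ds, epochs=300)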

  • What are you trying to achieve with this retraining? Please explain. Also, EfficientNet's output is a Dense layer of 1000 units, so do you have roughly 40 images per class for training and 5 per class for validation? – Naor Tedgi Nov 06 '19 at 05:56
  • Hmm, good point. I've since moved on to other work so I can't say for sure. I'll let you know if I can dig up an answer. – Adnan Fiaz Nov 08 '19 at 09:06

0 Answers