
I'm working on image segmentation using a convolutional neural network (CNN) implemented in TensorFlow. I have two classes, I am using cross-entropy as the loss function and Adam as the optimizer, and I am training the network with around 150 images.
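Roughly, the setup looks like this (a simplified sketch; the layer sizes, input shape, and dropout rate are placeholders, not my exact architecture):

```python
import tensorflow as tf

# Simplified sketch of the setup; the real architecture and input
# pipeline are omitted. Shapes and layer sizes are placeholders.
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, 3, padding="same", activation="relu",
                           input_shape=(256, 256, 3)),
    tf.keras.layers.Conv2D(32, 3, padding="same", activation="relu"),
    tf.keras.layers.Dropout(0.5),
    # One probability per pixel for binary (two-class) segmentation.
    tf.keras.layers.Conv2D(1, 1, activation="sigmoid"),
])

model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
    loss=tf.keras.losses.BinaryCrossentropy(),
)
```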

During training I see a periodic pattern: the training loss descends until it hits a couple of high values, then it quickly descends back to the previous level.

training loss chart

A similar pattern can also be observed in the validation loss: it drops periodically for a couple of epochs and then goes back to the previous level.

validation loss chart

When I decrease the learning rate, these patterns are no longer visible, but the losses are higher and the Intersection over Union (IoU) is much lower.

Edit: I've found that I had one image twice in the dataset, with two slightly different labels. I also noticed that the pattern is related to dropout: once the training images are learned to 100%, dropout makes the training error increase a lot in some iterations, and this causes the peaks. Has anyone experienced something like this with dropout?
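One way to check whether the spikes really come from dropout (a sketch, assuming a Keras-style model; the helper names and the number of samples are mine, not from any library) is to compare the training loss with dropout disabled against the average loss over several stochastic passes with dropout active:

```python
import tensorflow as tf

bce = tf.keras.losses.BinaryCrossentropy()

def loss_without_dropout(model, images, labels):
    # training=False disables dropout, so this measures the loss the
    # deterministic network achieves on the training images.
    preds = model(images, training=False)
    return bce(labels, preds).numpy()

def loss_with_dropout(model, images, labels, n_samples=10):
    # training=True keeps dropout active; averaging several stochastic
    # passes shows how much the dropout masks inflate the loss.
    losses = [bce(labels, model(images, training=True)).numpy()
              for _ in range(n_samples)]
    return sum(losses) / len(losses)
```

If the deterministic loss stays low and flat while the stochastic average spikes, the peaks come from particular dropout masks rather than from the data.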

Has anyone seen patterns like this? What could be the reasons for it?

Decano
  • Are you properly shuffling your training data? Something like this can happen if the data is not shuffled: your model will only see inputs of one class and get used to those (getting better, reducing loss), and at some point training reaches the point in the data where it switches to the other class, which will heavily reduce performance (since the model is biased towards the first class at that point). – xdurch0 Jun 05 '18 at 14:06
  • Yes, I shuffle the data using `random.shuffle` (see the shuffling sketch after these comments); I could try changing the random seed to see if that changes anything. I have two images per batch, and when an epoch is finished I have trained on all my training images. I thought the problem could be one particular image, but the network sees all the images in every epoch. – Decano Jun 05 '18 at 15:35
  • I found an answer for this: I had one duplicated image in my dataset with slightly different labeling, so the dataset was inconsistent and impossible to learn perfectly. After fixing this there are still some apparently periodic peaks, but only in the first 200 epochs. – Decano Jun 13 '18 at 13:16
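For illustration, here is one way to get a fresh shuffle every epoch with `tf.data` instead of `random.shuffle` (the file names, loader, and batch size below are hypothetical placeholders, not the asker's actual pipeline):

```python
import tensorflow as tf

# Hypothetical file lists standing in for the ~150 image/label pairs.
image_paths = [f"images/{i:03d}.png" for i in range(150)]
label_paths = [f"labels/{i:03d}.png" for i in range(150)]

def load_pair(img_path, lbl_path):
    # Placeholder loader: decode one image/label pair from disk.
    img = tf.io.decode_png(tf.io.read_file(img_path), channels=3)
    lbl = tf.io.decode_png(tf.io.read_file(lbl_path), channels=1)
    return tf.image.convert_image_dtype(img, tf.float32), lbl

dataset = (
    tf.data.Dataset.from_tensor_slices((image_paths, label_paths))
    # A buffer as large as the dataset gives a full shuffle, and
    # reshuffle_each_iteration=True reorders the data every epoch.
    .shuffle(buffer_size=150, reshuffle_each_iteration=True)
    .map(load_pair)
    .batch(2)  # two images per batch, as described in the comments
)
```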

1 Answer


There was a repeated image in the dataset: the image was labeled twice, and the two labelings were not exactly identical, so the model couldn't converge because the dataset was inconsistent. Whenever the model got close to fitting one copy, it adjusted to fit the other, and this process kept repeating.
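A quick way to catch such duplicates (a sketch, assuming the images live in a single folder and the copies are byte-identical; the folder name is a placeholder) is to group the files by a hash of their raw bytes:

```python
import hashlib
import os
from collections import defaultdict

def find_duplicate_images(image_dir):
    """Group image files by the MD5 hash of their raw bytes; any group
    with more than one file is a byte-identical duplicate."""
    groups = defaultdict(list)
    for name in sorted(os.listdir(image_dir)):
        path = os.path.join(image_dir, name)
        with open(path, "rb") as f:
            groups[hashlib.md5(f.read()).hexdigest()].append(name)
    return [files for files in groups.values() if len(files) > 1]

# Example: print any duplicate groups in a hypothetical "images/" folder.
for dupes in find_duplicate_images("images"):
    print("duplicates:", dupes)
```

Note that this only finds byte-identical copies; if a duplicate was re-encoded or resized, comparing the decoded pixel arrays would be needed instead.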

Decano