
I am fairly new to this and am writing my bachelor thesis using Keras. I have a large CNN built similarly to VGG, but slightly different: my images have a higher resolution, so I pool a bit more. I added a 2048-unit dense layer on top. What dropout rate should I use? I want to go with a high rate, since I have very little data (see below) and have added many neurons. But what happens when the rate is too high?

I am asking because I have limited time and the network takes about 3 days to train. If anyone has answers or tips, I'd be very grateful. Any other recommendations on what to change, or what has worked for you, are also very welcome.

Thanks in advance! Here is how I build my model:

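# Imports and input_shape were not shown in the original post; tensorflow.keras is assumed here.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout

model = Sequential()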
model.add(Conv2D(64, (3, 3), strides=1, activation='swish', input_shape = input_shape, trainable=True))
model.add(MaxPooling2D((2, 2), name='pool0'))
model.add(Conv2D(64, (3, 3), strides=1, activation='swish', trainable=True))
model.add(MaxPooling2D((2, 2), name='pool1'))

model.add(Conv2D(128, (3, 3), strides=1, activation='swish', trainable=True))
model.add(Conv2D(128, (3, 3), strides=1, activation='swish', trainable=True))
model.add(MaxPooling2D((2, 2),name='pool2'))

model.add(Conv2D(256, (3, 3), strides=1, activation='swish', trainable=True))
model.add(Conv2D(256, (3, 3), strides=1, activation='swish', trainable=True))
model.add(Conv2D(256, (3, 3), strides=1, activation='swish', trainable=True))
model.add(MaxPooling2D((2, 2),name='pool3'))

model.add(Conv2D(512, (3, 3), strides=1, activation='swish', trainable=True))
model.add(Conv2D(512, (3, 3), strides=1, activation='swish', trainable=True))
model.add(Conv2D(512, (3, 3), strides=1, activation='swish', trainable=True))
model.add(MaxPooling2D((2, 2),name='pool4'))

model.add(Conv2D(512, (3, 3), strides=1, activation='swish', trainable=True))
model.add(Conv2D(512, (3, 3), strides=1, activation='swish', trainable=True))
model.add(Conv2D(512, (3, 3), strides=1, activation='swish', trainable=True))
model.add(MaxPooling2D((2, 2),name='pool5')) 

model.add(Flatten())
model.add(Dense(2048,activation='swish', name='vgg_int'))
model.add(Dropout(0.65))
model.add(Dense(17,activation='softmax')) 

I should also add that I have very little data to train on, which is why I want the high dropout rate. I have around 100 images per class, sometimes as few as 60 and sometimes as many as 200:


Found 1807 images belonging to 17 classes.
Found 170 images belonging to 17 classes.

I am confident it can exceed 90% on the validation set, but I don't really know the best way to get there. What happens if I use 90% dropout? I am currently running 60% with a smaller model that has only 1024 neurons in the top dense layer:

Epoch 19/50
226/226 [==============================] - 4966s 22s/step - loss: 0.5661 - accuracy: 0.8307 - val_loss: 0.5752 - val_accuracy: 0.8412
Epoch 20/50
226/226 [==============================] - 4157s 18s/step - loss: 0.5511 - accuracy: 0.8329 - val_loss: 0.5042 - val_accuracy: 0.8647

I am using batch_size = 8 and optimizer=optimizers.Adam(learning_rate=0.0000015).

Again, thanks a lot!

Shaved Man

1 Answer


Dropout is used to prevent the model from overfitting. I can understand why you would want a high dropout rate, since your dataset is really small. But a very high rate is detrimental: it removes most of the layer's units at every training step and gets in the way of the model learning properly. Since you have a validation set, use it to check whether your model is overfitting. You can stop training when a large gap opens up between training accuracy and validation accuracy. I recommend starting with a dropout rate of 0.5 and increasing it gradually if you are not satisfied with the model's performance.
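To make the "use the validation set to decide when to stop" advice concrete, here is a minimal sketch of patience-based early stopping in plain Python. The function name `early_stop_epoch`, the `patience` and `min_delta` parameters, and the loss values in the usage note are all illustrative assumptions, not taken from the question. It watches validation loss rather than the accuracy gap, which is a common variant of the same idea.

```python
def early_stop_epoch(val_losses, patience=3, min_delta=0.0):
    """Scan per-epoch validation losses and decide where to stop.

    Returns (stop_epoch, best_epoch), both 1-indexed.
    stop_epoch is None if the patience budget was never exhausted.
    """
    best_loss = float("inf")
    best_epoch = None
    epochs_without_improvement = 0

    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss - min_delta:
            # Validation loss improved: remember it and reset the counter.
            best_loss = loss
            best_epoch = epoch
            epochs_without_improvement = 0
        else:
            # No improvement: spend one unit of patience.
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch, best_epoch

    return None, best_epoch
```

For example, with the made-up history `[0.90, 0.70, 0.60, 0.58, 0.59, 0.61, 0.60, 0.57]` and `patience=3`, this stops at epoch 7 and reports epoch 4 as the best one. Keras ships similar behaviour as the `EarlyStopping` callback (`monitor='val_loss'`, `patience=...`, `restore_best_weights=True`), passed to `model.fit(..., callbacks=[...])`, so with 3-day training runs you would probably use that rather than hand-rolling it.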

Hari Krishnan
  • In general there is no optimal value for dropout; it is all about testing and experimenting. Also, if you find your model is still overfitting, you can reduce the size of your architecture, since you have little training data. – Bradley Juma Oct 21 '20 at 22:28
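To get a feel for what "reducing the architecture" could mean here, the following sketch counts the weights in the dense head of the model from the question. The layer order (3x3 convolutions with Keras' default 'valid' padding and stride 1, 2x2 max pooling) is read off the posted code; the input resolutions are purely hypothetical, because the question does not state `input_shape`. The helper names `final_side` and `dense_head_params` are made up for this illustration.

```python
# Layer order from the question: "C" = 3x3 conv (valid padding, stride 1),
# "P" = 2x2 max pooling. Each conv shrinks a side by 2, each pool halves it.
LAYOUT = "CP" "CP" "CCP" "CCCP" "CCCP" "CCCP"


def final_side(side, layout=LAYOUT):
    """Side length of the (square) feature map after the conv/pool stack."""
    for op in layout:
        side = side - 2 if op == "C" else side // 2
    return side


def dense_head_params(side, channels=512, hidden=2048, classes=17):
    """Return (flattened features, hidden-layer params, output-layer params)."""
    flat = side * side * channels
    hidden_params = flat * hidden + hidden      # weights + biases
    output_params = hidden * classes + classes  # weights + biases
    return flat, hidden_params, output_params


if __name__ == "__main__":
    # Hypothetical square input sizes; the real one is not given in the post.
    for input_side in (512, 1024):
        side = final_side(input_side)
        flat, hidden_params, output_params = dense_head_params(side)
        print(input_side, side, flat, hidden_params, output_params)
```

Under these assumptions the first dense layer grows with the square of the final feature-map side, so at higher resolutions it can dominate the parameter count. With under 2,000 training images, shrinking that layer (or pooling further before `Flatten`) is one of the cheaper things to try alongside tuning the dropout rate.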