Maybe it's too late to answer, but better late than never.
It depends on the size of your dataset. Generally, a pre-trained model is recommended when you have a limited amount of data and/or want to avoid overfitting: its convolutional base already extracts general visual features from your samples, which usually translates into higher accuracy.
If you are using Keras, try starting with VGG16:
from keras.applications import VGG16

conv_base = VGG16(weights="imagenet",
                  include_top=False,          # drop the ImageNet classifier head
                  input_shape=(150, 150, 3))  # change the shape to match your data

conv_base.summary()
This gives you the following layer stack:
Layer (type) Output Shape Param #
=================================================================
input_1 (InputLayer) (None, 150, 150, 3) 0
_________________________________________________________________
block1_conv1 (Conv2D) (None, 150, 150, 64) 1792
_________________________________________________________________
block1_conv2 (Conv2D) (None, 150, 150, 64) 36928
_________________________________________________________________
block1_pool (MaxPooling2D) (None, 75, 75, 64) 0
_________________________________________________________________
block2_conv1 (Conv2D) (None, 75, 75, 128) 73856
_________________________________________________________________
block2_conv2 (Conv2D) (None, 75, 75, 128) 147584
_________________________________________________________________
block2_pool (MaxPooling2D) (None, 37, 37, 128) 0
_________________________________________________________________
block3_conv1 (Conv2D) (None, 37, 37, 256) 295168
_________________________________________________________________
block3_conv2 (Conv2D) (None, 37, 37, 256) 590080
_________________________________________________________________
block3_conv3 (Conv2D) (None, 37, 37, 256) 590080
_________________________________________________________________
block3_pool (MaxPooling2D) (None, 18, 18, 256) 0
_________________________________________________________________
block4_conv1 (Conv2D) (None, 18, 18, 512) 1180160
_________________________________________________________________
block4_conv2 (Conv2D) (None, 18, 18, 512) 2359808
_________________________________________________________________
block4_conv3 (Conv2D) (None, 18, 18, 512) 2359808
_________________________________________________________________
block4_pool (MaxPooling2D) (None, 9, 9, 512) 0
_________________________________________________________________
block5_conv1 (Conv2D) (None, 9, 9, 512) 2359808
_________________________________________________________________
block5_conv2 (Conv2D) (None, 9, 9, 512) 2359808
_________________________________________________________________
block5_conv3 (Conv2D) (None, 9, 9, 512) 2359808
_________________________________________________________________
block5_pool (MaxPooling2D) (None, 4, 4, 512) 0
=================================================================
Total params: 14,714,688
Trainable params: 14,714,688
Non-trainable params: 0
There are two ways to use this model. The first is to run your samples through the convolutional base once, save the extracted features to disk, and then train your densely connected layers on those saved features. This approach is much faster than the second one described below, but its drawback is that you can't use data augmentation. This is how you extract the features with the predict method of conv_base:
features_batch = conv_base.predict(inputs_batch)
# Collect the features in a tensor and feed them to the dense layers once they have all been extracted
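For context, here is a fuller sketch of that extraction loop, adapted from the same book (the directory path, sample count, and batch size are placeholders you'd adjust to your dataset):
import numpy as np
from keras.preprocessing.image import ImageDataGenerator

sample_count = 2000   # placeholder: number of training images
batch_size = 20

datagen = ImageDataGenerator(rescale=1./255)
generator = datagen.flow_from_directory(
    'path/to/train_dir',          # placeholder directory
    target_size=(150, 150),
    batch_size=batch_size,
    class_mode='binary')

# block5_pool outputs (4, 4, 512) for 150x150 inputs, per the summary above
features = np.zeros((sample_count, 4, 4, 512))
labels = np.zeros((sample_count,))

i = 0
for inputs_batch, labels_batch in generator:
    features_batch = conv_base.predict(inputs_batch)
    features[i * batch_size:(i + 1) * batch_size] = features_batch
    labels[i * batch_size:(i + 1) * batch_size] = labels_batch
    i += 1
    if i * batch_size >= sample_count:
        break  # the generator yields batches endlessly, so stop manually

# Flatten so the features can be fed to a Dense layer
features = np.reshape(features, (sample_count, 4 * 4 * 512))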
The second option is to attach your densely connected layers on top of the VGG model, freeze the conv_base layers, and feed your data to the network normally. This way you can use data augmentation, but every input passes through the whole convolutional base on every epoch, so only do this if you have access to a powerful GPU or a cloud instance. Here is how to freeze the base and connect your dense layers on top of VGG:
# Code adapted from the "Deep Learning with Python" book
from keras import models
from keras import layers

conv_base.trainable = False  # freeze all VGG16 weights

model = models.Sequential()
model.add(conv_base)                              # frozen convolutional base
model.add(layers.Flatten())                       # (4, 4, 512) -> 8192
model.add(layers.Dense(256, activation='relu'))
model.add(layers.Dense(1, activation='sigmoid'))  # binary classification head
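To train this with augmentation, you compile and fit as usual; below is a minimal sketch (the directory path, augmentation parameters, learning rate, and step counts are illustrative, not prescriptive):
from keras import optimizers
from keras.preprocessing.image import ImageDataGenerator

# Augmentation is applied only to the training data
train_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    horizontal_flip=True)

train_generator = train_datagen.flow_from_directory(
    'path/to/train_dir',        # placeholder directory
    target_size=(150, 150),
    batch_size=20,
    class_mode='binary')

model.compile(loss='binary_crossentropy',
              optimizer=optimizers.RMSprop(lr=2e-5),
              metrics=['acc'])

model.fit_generator(train_generator,
                    steps_per_epoch=100,
                    epochs=30)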
You can even fine-tune the model by unfreezing the top layers of conv_base so they adapt to your data. Here is how you freeze everything up to block5_conv1 and unfreeze that layer and the ones above it:
conv_base.trainable = True
set_trainable = False
for layer in conv_base.layers:
    if layer.name == 'block5_conv1':
        set_trainable = True  # unfreeze from this layer onwards
    if set_trainable:
        layer.trainable = True
    else:
        layer.trainable = False
# build and compile your model as before
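One thing to keep in mind: after changing which layers are frozen, you must recompile the model for the change to take effect, and it's safer to use a very low learning rate so large gradient updates don't destroy the pre-trained representations. A sketch (the 1e-5 rate is the value the book uses; treat it as a starting point):
from keras import optimizers

model.compile(loss='binary_crossentropy',
              optimizer=optimizers.RMSprop(lr=1e-5),  # low LR to protect pre-trained weights
              metrics=['acc'])
# then continue training with fit_generator as before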
Hope it helps you get started.