Hello everyone, I could use some advice on whether the approach I employed to apply transfer learning to the ResNet50 model is correct. After reading many articles and resources online, it is hard to tell whether the method I adopted is right. I should mention that I am using 500 images/labels (with labels ranging from 0-25) to train my model. Let us first go over the first section of building the model; the code is below:
import numpy as np
import tensorflow
from sklearn.model_selection import train_test_split
from tensorflow.keras.applications import ResNet50

X_train, X_test, y_train, y_test = train_test_split(files, labels, test_size=0.2)
X_train = np.array(X_train)
X_test = np.array(X_test)
y_train = np.array(y_train)
y_test = np.array(y_test)

# Load ResNet50 pre-trained on ImageNet, without its classification head
input_t = (224, 224, 3)
resnet = ResNet50(input_shape=input_t, weights='imagenet', include_top=False)

# Freeze the pre-trained layers so only the new head gets trained
for layer in resnet.layers:
    layer.trainable = False
So I create my train/test split and instantiate the ResNet50 model, then freeze its layers so that they are not trained. However, it is unclear to me whether I should freeze all the layers or only some of them.
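For reference, one variant I came across is to unfreeze only the last convolutional block and keep everything before it frozen. A rough sketch (the "conv5_" prefix follows the Keras ResNet50 layer naming; where to cut is just an example):

# Partial freeze: train only the final ResNet50 block ("conv5_..." layers)
# and keep all earlier layers frozen. This cut-off point is only an example.
for layer in resnet.layers:
    layer.trainable = layer.name.startswith('conv5_')

I am not sure whether unfreezing anything is worthwhile with only 500 images, which is part of what I am asking.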
Let us now move on to the next section; the code is below:
to_res = (224, 224)   # target spatial size expected by the ResNet50 backbone

model = tensorflow.keras.models.Sequential()
# Resize incoming images to 224x224 before passing them to ResNet50
model.add(tensorflow.keras.layers.Lambda(lambda image: tensorflow.image.resize(image, to_res)))
model.add(resnet)
# Classification head trained from scratch on my 26 classes
model.add(tensorflow.keras.layers.Flatten())
model.add(tensorflow.keras.layers.BatchNormalization())
model.add(tensorflow.keras.layers.Dense(256, activation='relu'))
model.add(tensorflow.keras.layers.Dropout(0.5))
model.add(tensorflow.keras.layers.BatchNormalization())
model.add(tensorflow.keras.layers.Dense(128, activation='relu'))
model.add(tensorflow.keras.layers.Dropout(0.5))
model.add(tensorflow.keras.layers.BatchNormalization())
model.add(tensorflow.keras.layers.Dense(64, activation='relu'))
model.add(tensorflow.keras.layers.Dropout(0.5))
model.add(tensorflow.keras.layers.BatchNormalization())
model.add(tensorflow.keras.layers.Dense(26, activation='softmax'))

# Integer labels (0-25), so sparse categorical cross-entropy
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(X_train, y_train, batch_size=32, epochs=5, verbose=1, validation_data=(X_test, y_test))
In this section, I essentially add extra layers on top of the ResNet50 backbone so that they can be trained on my data. At the end I use a softmax activation with 26 units, since my labels range from 0-25, and then finish by fitting the model on my data.
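For completeness, this is how I read a prediction back to check that the softmax output maps onto the 0-25 labels (a quick sketch, assuming X_test already holds preprocessed images):

# The softmax layer outputs 26 probabilities per image; the predicted
# label is the index of the largest one, matching the 0-25 label range.
probs = model.predict(X_test[:1])                      # shape (1, 26)
predicted_label = int(np.argmax(probs, axis=1)[0])
print(predicted_label, probs[0, predicted_label])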
Please let me know if there are things you agree or disagree with; any tips or recommendations are also welcome. Thanks for reading.