I'm trying to fine-tune a modified InceptionV3 model in Keras.
I followed the example "Fine-tune InceptionV3 on a new set of classes" on this page.
So I first trained the top dense layers that were added on top of the InceptionV3 base model with the following code:
model = Model(inputs=base_model.input, outputs=predictions)

for layer in base_model.layers:
    layer.trainable = False

parallel_model = multi_gpu_model(model, gpus=2)
parallel_model.compile(optimizer='rmsprop', loss='categorical_crossentropy')

history = parallel_model.fit_generator(generate_batches(path),
                                       steps_per_epoch=num_images / batch_size,
                                       epochs=num_epochs)
After that, I try to fine-tune the top 2 inception blocks from InceptionV3. And according to the example, what I should do is:
for layer in model.layers[:249]:
    layer.trainable = False
for layer in model.layers[249:]:
    layer.trainable = True

model.compile(optimizer=SGD(lr=0.0001, momentum=0.9), loss='categorical_crossentropy')
model.fit_generator(...)
But I'm using multi_gpu_model, so I don't know how to freeze the first 249 layers.
I mean, if I freeze the layers in the single-GPU model (as in the example) and then call parallel_model = multi_gpu_model(model, gpus=2) again so that the parallel_model picks up the frozen layers, won't the weights in the top dense layers that I just trained, which are contained in the parallel_model, be overwritten?
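To make the concern concrete, here is a minimal sketch of the check I have in mind (assuming multi_gpu_model comes from keras.utils and that the last layer of model is the dense head trained above):

import numpy as np
from keras.utils import multi_gpu_model

# dense-head weights in the template model after the first training phase
head_weights_before = model.layers[-1].get_weights()

# re-wrap the same template model for 2 GPUs
parallel_model = multi_gpu_model(model, gpus=2)

# if re-wrapping overwrote the head, these would differ from the ones above
head_weights_after = model.layers[-1].get_weights()
print(all(np.array_equal(a, b)
          for a, b in zip(head_weights_before, head_weights_after)))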
On the other hand, I tried to use for layer in parallel_model.layers[:249]: layer.trainable = False directly, but when I enumerated the layers in the parallel_model, it showed:
for i, layer in enumerate(parallel_model.layers):
    print(i, layer.name)
(0, 'input_1')
(1, 'lambda_1')
(2, 'lambda_2')
(3, 'model_1')
(4, 'dense_3')
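My guess is that 'model_1' is the original single-GPU model wrapped up as a single layer, so I could presumably look inside it like this (a minimal sketch, assuming get_layer returns the nested model):

inner_model = parallel_model.get_layer('model_1')
for i, layer in enumerate(inner_model.layers):
    print(i, layer.name)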
So what are the 'lambda_1', 'lambda_2' and 'model_1' layers, and why does the parallel_model only show 5 layers?
More importantly, how do I freeze the layers in the parallel_model?