Issue with transfer learning with Tensorflow and Keras

Question

I've been trying to recreate the work done in this blog post. The writeup is very comprehensive, and the code is shared via a Colab notebook.

What I'm trying to do is extract layers from the pretrained VGG19 network and create a new network with these layers as the outputs. However, when I assemble the new network, it highly resembles the VGG19 network and seems to contain layers that I didn't extract. An example is below.

import tensorflow as tf
from tensorflow.keras import models  # public API path instead of the private tensorflow.python.keras

## Create network based on VGG19 arch with pretrained weights
vgg = tf.keras.applications.vgg19.VGG19(include_top=False, weights='imagenet')
vgg.trainable = False

When we look at the summary of the VGG19, we see an architecture that we expect.

vgg.summary()
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_2 (InputLayer)         (None, None, None, 3)     0         
_________________________________________________________________
block1_conv1 (Conv2D)        (None, None, None, 64)    1792      
_________________________________________________________________
block1_conv2 (Conv2D)        (None, None, None, 64)    36928     
_________________________________________________________________
block1_pool (MaxPooling2D)   (None, None, None, 64)    0         
_________________________________________________________________
block2_conv1 (Conv2D)        (None, None, None, 128)   73856     
_________________________________________________________________
block2_conv2 (Conv2D)        (None, None, None, 128)   147584    
_________________________________________________________________
block2_pool (MaxPooling2D)   (None, None, None, 128)   0         
_________________________________________________________________
block3_conv1 (Conv2D)        (None, None, None, 256)   295168    
_________________________________________________________________
block3_conv2 (Conv2D)        (None, None, None, 256)   590080    
_________________________________________________________________
block3_conv3 (Conv2D)        (None, None, None, 256)   590080    
_________________________________________________________________
block3_conv4 (Conv2D)        (None, None, None, 256)   590080    
_________________________________________________________________
block3_pool (MaxPooling2D)   (None, None, None, 256)   0         
_________________________________________________________________
block4_conv1 (Conv2D)        (None, None, None, 512)   1180160   
_________________________________________________________________
block4_conv2 (Conv2D)        (None, None, None, 512)   2359808   
_________________________________________________________________
block4_conv3 (Conv2D)        (None, None, None, 512)   2359808   
_________________________________________________________________
block4_conv4 (Conv2D)        (None, None, None, 512)   2359808   
_________________________________________________________________
block4_pool (MaxPooling2D)   (None, None, None, 512)   0         
_________________________________________________________________
block5_conv1 (Conv2D)        (None, None, None, 512)   2359808   
_________________________________________________________________
block5_conv2 (Conv2D)        (None, None, None, 512)   2359808   
_________________________________________________________________
block5_conv3 (Conv2D)        (None, None, None, 512)   2359808   
_________________________________________________________________
block5_conv4 (Conv2D)        (None, None, None, 512)   2359808   
_________________________________________________________________
block5_pool (MaxPooling2D)   (None, None, None, 512)   0         
=================================================================
Total params: 20,024,384
Trainable params: 0
Non-trainable params: 20,024,384
_________________________________________________________________

Then, we extract the layers and create a new model

## Layers to extract
content_layers = ['block5_conv2'] 
style_layers = ['block1_conv1','block2_conv1','block3_conv1','block4_conv1','block5_conv1']
## Get output layers corresponding to style and content layers 
style_outputs = [vgg.get_layer(name).output for name in style_layers]
content_outputs = [vgg.get_layer(name).output for name in content_layers]
model_outputs = style_outputs + content_outputs

new_model = models.Model(vgg.input, model_outputs)

When new_model is created, I believe we should have a much smaller model. However, its summary shows that the new model is very close to the original (it contains 19 of the 22 layers from VGG19) and includes layers that we didn't extract.

new_model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_2 (InputLayer)         (None, None, None, 3)     0         
_________________________________________________________________
block1_conv1 (Conv2D)        (None, None, None, 64)    1792      
_________________________________________________________________
block1_conv2 (Conv2D)        (None, None, None, 64)    36928     
_________________________________________________________________
block1_pool (MaxPooling2D)   (None, None, None, 64)    0         
_________________________________________________________________
block2_conv1 (Conv2D)        (None, None, None, 128)   73856     
_________________________________________________________________
block2_conv2 (Conv2D)        (None, None, None, 128)   147584    
_________________________________________________________________
block2_pool (MaxPooling2D)   (None, None, None, 128)   0         
_________________________________________________________________
block3_conv1 (Conv2D)        (None, None, None, 256)   295168    
_________________________________________________________________
block3_conv2 (Conv2D)        (None, None, None, 256)   590080    
_________________________________________________________________
block3_conv3 (Conv2D)        (None, None, None, 256)   590080    
_________________________________________________________________
block3_conv4 (Conv2D)        (None, None, None, 256)   590080    
_________________________________________________________________
block3_pool (MaxPooling2D)   (None, None, None, 256)   0         
_________________________________________________________________
block4_conv1 (Conv2D)        (None, None, None, 512)   1180160   
_________________________________________________________________
block4_conv2 (Conv2D)        (None, None, None, 512)   2359808   
_________________________________________________________________
block4_conv3 (Conv2D)        (None, None, None, 512)   2359808   
_________________________________________________________________
block4_conv4 (Conv2D)        (None, None, None, 512)   2359808   
_________________________________________________________________
block4_pool (MaxPooling2D)   (None, None, None, 512)   0         
_________________________________________________________________
block5_conv1 (Conv2D)        (None, None, None, 512)   2359808   
_________________________________________________________________
block5_conv2 (Conv2D)        (None, None, None, 512)   2359808   
=================================================================
Total params: 15,304,768
Trainable params: 15,304,768
Non-trainable params: 0
_________________________________________________________________

So my questions are...

Why are layers that I didn't extract showing up in new_model? Are these being inferred by the model's instantiation process per the docs? This just seems too close to the VGG19 architecture to be inferred.
As I understand Keras' Model (functional API), passing multiple output layers should create a model with multiple outputs. However, the new model seems to be sequential and to have only a single output layer. Is this the case?

score 2 · Accepted Answer · answered Oct 03 '18 at 03:47

Why are layers that I didn't extract showing up in new_model?

That's because when you create a model with models.Model(vgg.input, model_outputs), the "intermediate" layers between vgg.input and the output layers are included as well. This is by design: each requested output is computed from the input by passing through every layer that precedes it in VGG.

For example, if you were to create a model this way: models.Model(vgg.input, vgg.get_layer('block2_pool').output), every intermediate layer between the input layer and block2_pool would be included, since the input has to flow through them before reaching block2_pool. Below is a partial graph of VGG that could help with that.

[Figure: partial graph of VGG]
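The block2_pool example can also be checked in code. The following is a minimal sketch, assuming a TensorFlow 2.x install; it passes weights=None (my choice, not part of the original answer) only to skip the ImageNet download, which does not change the layer graph. The names sub_model, layer_names and shares_layers are illustrative.

```python
import tensorflow as tf

# weights=None skips the ImageNet download; the architecture is unchanged.
vgg = tf.keras.applications.VGG19(include_top=False, weights=None)

# Ask only for the output of block2_pool ...
sub_model = tf.keras.Model(vgg.input, vgg.get_layer('block2_pool').output)

# ... and Keras still keeps every layer on the path from the input to it.
layer_names = [layer.name for layer in sub_model.layers]
print(layer_names)

# The sub-model reuses vgg's layer objects instead of copying them.
shares_layers = (sub_model.get_layer('block1_conv1')
                 is vgg.get_layer('block1_conv1'))
print(shares_layers)
```

The printed list should start with the input layer (its exact name depends on the Keras version) followed by the block1 and block2 layers up to block2_pool, and nothing after it.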

Now, if I haven't misunderstood, if you want to create a model that doesn't include those intermediate layers (which would probably work poorly), you have to build one yourself. The functional API is very useful for this. There are examples in the documentation, but the gist of what you want to do is as below:

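from tensorflow.keras import models
from tensorflow.keras.layers import Conv2D, Input

x_input = Input(shape=(28, 28, 1))
# Each branch reads the input directly, so no intermediate layers are involved.
# Note that these are new, untrained layers, not the pretrained VGG ones.
block1_conv1 = Conv2D(64, (3, 3), padding='same')(x_input)
block2_conv2 = Conv2D(128, (3, 3), padding='same')(x_input)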
...

new_model = models.Model(x_input, [block1_conv1, block2_conv2, ...])

... however, it seems that the new model is sequential and only has a single output layer. Is this the case?

No, your model has multiple outputs, as you intended. new_model.summary() should display which layers each layer is connected to (which would help in understanding the structure), but I believe there is a small bug in some versions that prevents that. In any case, you can see that your model has multiple outputs by checking new_model.output, which should give you:

[<tf.Tensor 'block1_conv1/Relu:0' shape=(?, ?, ?, 64) dtype=float32>,
 <tf.Tensor 'block2_conv1/Relu:0' shape=(?, ?, ?, 128) dtype=float32>,
 <tf.Tensor 'block3_conv1/Relu:0' shape=(?, ?, ?, 256) dtype=float32>,
 <tf.Tensor 'block4_conv1/Relu:0' shape=(?, ?, ?, 512) dtype=float32>,
 <tf.Tensor 'block5_conv1/Relu:0' shape=(?, ?, ?, 512) dtype=float32>,
 <tf.Tensor 'block5_conv2/Relu:0' shape=(?, ?, ?, 512) dtype=float32>]

Printing it sequentially in new_model.summary() is just a design choice, as it would get hairy with complex models.
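To confirm the multiple outputs without relying on summary(), one option is to run a dummy batch through the model and count what comes back. This is a sketch under assumptions: TensorFlow 2.x, weights=None to avoid the download, and an arbitrary 64x64 input size chosen only for illustration. The names dummy, features, num_outputs and channels are not from the question.

```python
import tensorflow as tf

vgg = tf.keras.applications.VGG19(include_top=False, weights=None)

style_layers = ['block1_conv1', 'block2_conv1', 'block3_conv1',
                'block4_conv1', 'block5_conv1']
content_layers = ['block5_conv2']
model_outputs = [vgg.get_layer(name).output
                 for name in style_layers + content_layers]
new_model = tf.keras.Model(vgg.input, model_outputs)

# One dummy 64x64 RGB image; the size is arbitrary for this check.
dummy = tf.zeros((1, 64, 64, 3))
features = new_model(dummy)  # a list with one tensor per requested layer

num_outputs = len(new_model.outputs)
channels = [int(f.shape[-1]) for f in features]
print(num_outputs, channels)
```

Each entry in features corresponds to one of the requested layers, in the order they were passed to the model, so the style and content activations can be split by slicing the list.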
