Some research papers mention that they used the conv3, conv4, and conv5 outputs of a VGG16 network trained on ImageNet.
If I display the names of the layers of VGG16 like so:
import tensorflow as tf

h = 512  # input height/width
base_model = tf.keras.applications.VGG16(input_shape=[h, h, 3], include_top=False)
base_model.summary()
I get layers with different names, e.g.:
input_1 (InputLayer) [(None, 512, 512, 3)] 0
_________________________________________________________________
block1_conv1 (Conv2D) (None, 512, 512, 64) 1792
_________________________________________________________________
block1_conv2 (Conv2D) (None, 512, 512, 64) 36928
_________________________________________________________________
block1_pool (MaxPooling2D) (None, 256, 256, 64) 0
_________________________________________________________________
block2_conv1 (Conv2D) (None, 256, 256, 128) 73856
_________________________________________________________________
block2_conv2 (Conv2D) (None, 256, 256, 128) 147584
_________________________________________________________________
block2_pool (MaxPooling2D) (None, 128, 128, 128) 0
_________________________________________________________________
block3_conv1 (Conv2D) (None, 128, 128, 256) 295168
_________________________________________________________________
block3_conv2 (Conv2D) (None, 128, 128, 256) 590080
_________________________________________________________________
block3_conv3 (Conv2D) (None, 128, 128, 256) 590080
_________________________________________________________________
block3_pool (MaxPooling2D) (None, 64, 64, 256) 0
.....
So which layers do they mean by conv3, conv4, and conv5? Do they mean the last convolutional layer of the 3rd, 4th, and 5th stages, i.e. the ones just before each pooling (since VGG16 has five stages)?
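If that reading is right (conv3/conv4/conv5 = the last conv layer of stages 3 to 5, i.e. block3_conv3, block4_conv3, block5_conv3 in the Keras names), the features could be pulled out like this. This is a sketch under that assumption, not a confirmed mapping; weights=None is used here only so the shapes can be checked without downloading the ImageNet weights:

```python
import tensorflow as tf

h = 512  # input height/width, matching the summary above
base_model = tf.keras.applications.VGG16(
    input_shape=[h, h, 3], include_top=False, weights=None)

# Assumed interpretation: last conv layer of each of stages 3-5
layer_names = ["block3_conv3", "block4_conv3", "block5_conv3"]
outputs = [base_model.get_layer(name).output for name in layer_names]

# Model that returns the three intermediate feature maps
feature_extractor = tf.keras.Model(inputs=base_model.input, outputs=outputs)

for name, out in zip(layer_names, outputs):
    print(name, out.shape)
```

With a 512x512 input this gives feature maps of 128x128x256, 64x64x512, and 32x32x512, which matches the stride-8/16/32 features such papers typically describe.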