import numpy as np
from PIL import Image
from keras.preprocessing import image
from keras.applications.vgg19 import VGG19, preprocess_input
from keras.layers import Input
from keras.models import Model

To create the VGG19 model I use:

img = Input(shape=(256,256,3))
vgg = VGG19(weights="imagenet")
vgg.outputs = [vgg.get_layer('block4_conv1').output]
model = Model(inputs=img, outputs=vgg(img))

Then in model.summary() I see:

block4_conv1 (Conv2D) (None, 28, 28, 512) 1180160

My expected dimensions are (28,28,512).

To load the image into the network I use:

img = image.load_img("./path-to-image.jpeg", target_size=(256, 256))
img = preprocess_input(np.array(img))

However, when I put my image through the model, the output shape is (1, 32, 32, 512), and I don't understand why this happens!

To get output dimensions I run:

img_out = model.predict(np.expand_dims(img, 0), batch_size=1)

>>> img_out.shape
(1, 32, 32, 512)  # != (28, 28, 512)

1 Answer


VGG19 takes a (224, 224, 3) input by default. block4_conv1 sits after three max-pooling layers, each of which halves the spatial dimensions, so 224 becomes 28 and you get (28, 28, num_kernels).

But your input is (256, 256, 3). Applying the same three max-pooling layers, you end up with (32, 32, num_kernels):

After 1st max-pool layer: (128, 128, num_kernels)
After 2nd max-pool layer: (64, 64, num_kernels)
After 3rd max-pool layer: (32, 32, num_kernels)
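
As a quick check (this sketch is not from the original post), you can build a VGG19 feature extractor at both input sizes and print the block4_conv1 output shapes; weights=None is used here just to skip the download:

import numpy as np
from keras.applications.vgg19 import VGG19
from keras.models import Model

def block4_shape(size):
    # include_top=False drops the dense head so an arbitrary input size is allowed
    base = VGG19(weights=None, include_top=False, input_shape=(size, size, 3))
    feat = Model(inputs=base.input, outputs=base.get_layer('block4_conv1').output)
    return feat.predict(np.zeros((1, size, size, 3))).shape

print(block4_shape(224))  # (1, 28, 28, 512)
print(block4_shape(256))  # (1, 32, 32, 512)

So if you really need the (28, 28, 512) feature map, one option is to load the image with target_size=(224, 224) instead of (256, 256); otherwise the 32x32 map is the expected result for a 256x256 input.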
