
I want to feed images with the shape (160,320,3) to

 VGG16(input_tensor=input_tensor, include_top=False)   

How can I include a layer that reshapes the images to the shape expected by the VGG16 model, which is (224, 224, 3)?

Oblomov

3 Answers


The VGG16 model in itself is just a set of weights for a fixed sequence of layers with fixed convolution kernel sizes. That doesn't mean those convolution kernels cannot be applied to images of other sizes.

For example in your case:

from keras.models import Model
from keras.layers import Dense, Flatten
from keras.applications import vgg16

# Load the VGG16 convolutional base pre-trained on ImageNet, with a custom
# input shape instead of the default (224, 224, 3).
model = vgg16.VGG16(weights='imagenet', include_top=False, input_shape=(160, 320, 3))
model.summary(line_length=150)

# Attach a new classification head on top of the convolutional base.
flatten = Flatten()
new_layer2 = Dense(10, activation='softmax', name='my_dense_2')

inp2 = model.input
out2 = new_layer2(flatten(model.output))

model2 = Model(inp2, out2)
model2.summary(line_length=150)

According to the Keras source, the minimum image size is 48x48x3; anything above that is fine.

Now, it's true the original weights were learnt on (224, 224, 3)-shaped images, but those filter weights act as a very good starting point for new tasks with a new set of images. You do need to re-train the network, but it will converge very quickly. This is the basis of transfer learning.
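
For example, a minimal transfer-learning sketch under the assumption of a hypothetical 10-class problem, freezing the pre-trained convolutional base from the code above and training only the new head (x_train and y_train are placeholder data, not from the question):

# Freeze the pre-trained convolutional layers so only the new Dense head is trained.
for layer in model.layers:
    layer.trainable = False

# Compile and train the extended model (model2) on the new task.
model2.compile(optimizer='adam',
               loss='categorical_crossentropy',
               metrics=['accuracy'])

# x_train: images of shape (160, 320, 3); y_train: one-hot labels over 10 classes (placeholders).
# model2.fit(x_train, y_train, epochs=5, batch_size=32)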

indraforyou
    The minimum image size [was updated to 32x32](https://github.com/keras-team/keras/blob/f630ad87a01ed2b4d08f91e5553b50c6a85601f6/keras/applications/vgg16.py#L90). – Alaa M. Jul 06 '21 at 07:37

There are two things that you need to do:

  1. Explicitly declare the input shape to have variable-sized inputs by defining None for the image width and height.
  2. Do not use Flatten(), as it relies on a fixed input shape. Instead use GlobalMaxPooling2D, which not only does the adaptive pooling but also flattens the tensor so the fully connected layers can work on it (see the sketch below).
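
A minimal sketch of that idea, assuming a hypothetical 10-class head (the class count and layer name are placeholders):

from keras.applications import vgg16
from keras.layers import Dense, GlobalMaxPooling2D
from keras.models import Model

# Variable spatial dimensions: the same model accepts images of any size.
base = vgg16.VGG16(weights='imagenet', include_top=False, input_shape=(None, None, 3))

# GlobalMaxPooling2D collapses the variable-sized feature maps into a
# fixed-length vector, so the Dense layer works for any input size.
x = GlobalMaxPooling2D()(base.output)
out = Dense(10, activation='softmax', name='predictions')(x)

model = Model(base.input, out)
model.summary()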

I hope this helps you achieve what you want.

Kashif

You can use the resize() function of the OpenCV library.

import cv2

width = 224
height = 224
dim = (width, height)

# `images` contains the original-dimension image array
resized_images = []
for i in range(images.shape[0]):
    # INTER_AREA interpolation is a reasonable default for resizing
    resized = cv2.resize(images[i], dim, interpolation=cv2.INTER_AREA)
    resized_images.append(resized)
  • Such an approach artificially increases computational cost without much, or any, benefit for learning capability. As far as I know, interpolated pixels won't add any value. – Bartek Wójcik Dec 10 '20 at 12:33