I'm using Keras' pre-trained VGG16 model, following this link: Keras VGG16. I'm trying to decode the prediction output into a word describing what's in the image:

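import numpy as np
from keras.applications.vgg16 import VGG16, preprocess_input, decode_predictions
from keras.preprocessing import image

model = VGG16(weights='imagenet', include_top=False)
img_path = 'elephant.jpg'
img = image.load_img(img_path, target_size=(224, 224))
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)  # add the batch dimension
x = preprocess_input(x)

features = model.predict(x)
(inID, label) = decode_predictions(features)[0]   #ERROR HERE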

The full error is:

ValueError: decode_predictions expects a batch of predictions (i.e. a 2D array of shape (samples, 1000)). Found array with shape: (1, 7, 7, 512)

Any comments or suggestions are highly appreciated. Thank you.

desertnaut
matchifang

2 Answers

You should change the first line to:

model = VGG16(weights='imagenet', include_top=True)

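With include_top=False, the model drops its fully connected classification layers and outputs 512 feature maps of size 7 x 7. That is why decode_predictions receives an array of shape (1, 7, 7, 512) instead of the (samples, 1000) array it expects, and this is the reason behind your error.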
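Putting the fix together, here is a minimal sketch rather than a definitive implementation. It assumes the same Keras import paths as in the question (newer releases may expose them under different modules), and the helper names classify_image and best_label are hypothetical. Note that decode_predictions returns one list per sample, and each entry is a (class_id, class_name, score) tuple, so the two-element unpacking from the question would still need to change.

```python
def classify_image(img_path, top=3):
    """Return the top ImageNet guesses for one image file."""
    # Imported inside the function so the module still loads
    # on machines where Keras is not installed.
    import numpy as np
    from keras.applications.vgg16 import (
        VGG16, preprocess_input, decode_predictions)
    from keras.preprocessing import image

    # include_top=True keeps the fully connected classifier, so the
    # output has shape (1, 1000) rather than (1, 7, 7, 512).
    model = VGG16(weights='imagenet', include_top=True)

    img = image.load_img(img_path, target_size=(224, 224))
    x = image.img_to_array(img)
    x = np.expand_dims(x, axis=0)  # add the batch dimension
    x = preprocess_input(x)

    preds = model.predict(x)
    # One list per sample; each entry is (class_id, class_name, score).
    return decode_predictions(preds, top=top)[0]


def best_label(decoded):
    """Pick the class name with the highest score from one decoded list."""
    class_id, label, score = max(decoded, key=lambda entry: entry[2])
    return label
```

With an image on disk, something like best_label(classify_image('elephant.jpg')) should then give a single class name.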

Marcin Możejko
Just to add to the correct answer by @Marcin Możejko:

The same applies to the other available models: you must include the top (classification) layers whenever you want to decode predictions. For example, with the R interface to Keras:

vgg19 <- application_vgg19(include_top = TRUE, weights = "imagenet")

model_resnet50 <- application_resnet50(include_top = TRUE, weights = "imagenet")

model_inception_v3 <- application_inception_v3(include_top = TRUE, weights = "imagenet")

model_xception <- application_xception(include_top = TRUE, weights = "imagenet")
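For readers using Python as in the question, the same idea might look like the sketch below. The helper load_classifier and the INPUT_SIZES table are hypothetical names introduced for illustration. One assumption worth checking against your Keras version: with the top layers included, InceptionV3 and Xception expect 299 x 299 inputs rather than 224 x 224, and each model family has its own preprocess_input.

```python
# Expected (height, width) of the input image when include_top=True.
INPUT_SIZES = {
    'vgg19': (224, 224),
    'resnet50': (224, 224),
    'inception_v3': (299, 299),
    'xception': (299, 299),
}


def load_classifier(name):
    """Build a pre-trained ImageNet classifier with its top layers included."""
    # Imported inside the function so the module still loads
    # on machines where Keras is not installed.
    from keras.applications import VGG19, ResNet50, InceptionV3, Xception

    constructors = {
        'vgg19': VGG19,
        'resnet50': ResNet50,
        'inception_v3': InceptionV3,
        'xception': Xception,
    }
    # include_top=True keeps the final classification layers, so
    # predict() returns a (samples, 1000) array that can be decoded.
    return constructors[name](weights='imagenet', include_top=True)
```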
Agile Bean