Training VGG16 from scratch

Question

I was trying to train a VGG16 network from scratch. For this, I downloaded the architecture from https://gist.github.com/baraldilorenzo/07d7802847aaad0a35d3

The code there is provided as vgg-16_keras.py. By default it expects 224x224 input images. My input images are the same size, so the image size should not be an issue.

Next I made some slight changes so that the architecture is ready to train on some sample images that I have at hand. When I tried to train my model, I got a "negative dimension" error. To debug the code, I looked for a function that reports the output dimensions of the different layers, but unfortunately I could not find one.

I am posting my code as well as the error message:

import keras
import numpy as np
from keras import backend as K
from keras.models import Sequential
from keras.layers import Activation, ZeroPadding2D, Convolution2D, MaxPooling2D, Dropout
from keras.layers.core import Dense, Flatten
from keras.optimizers import Adam
from keras.metrics import categorical_crossentropy
from keras.preprocessing.image import ImageDataGenerator
from keras.layers.normalization import BatchNormalization
from keras.layers.convolutional import *
from matplotlib import pyplot as plt
from sklearn.metrics import confusion_matrix
import itertools
from matplotlib.pyplot import *


train_path="cats-and-dogs/train"
valid_path="cats-and-dogs/valid"
test_path="cats-and-dogs/test"

train_batches=ImageDataGenerator().flow_from_directory(train_path, target_size=(224,224), classes=['dog','cat'], batch_size=20)
valid_batches=ImageDataGenerator().flow_from_directory(valid_path, target_size=(224,224), classes=['dog','cat'], batch_size=10)
test_batches=ImageDataGenerator().flow_from_directory(test_path, target_size=(224,224), classes=['dog','cat'], batch_size=10)

imgs,labels=next(train_batches)

# Defining individual layers for our CNN

l1=ZeroPadding2D((1,1),input_shape=(3,224,224))
l2=Convolution2D(64, 3, activation='relu')
l3=ZeroPadding2D((1,1))
l4=Convolution2D(64, 3, activation='relu')
l5=MaxPooling2D((2,2), strides=(2,2))

#
#
l6=ZeroPadding2D((1,1))
l7=Convolution2D(128, 3, activation='relu')
l8=ZeroPadding2D((1,1))
l9=Convolution2D(128, 3, activation='relu')
l10=MaxPooling2D((2,2), strides=(2,2))

l11=ZeroPadding2D((1,1))
l12=Convolution2D(256, 3, 3, activation='relu')
l13=ZeroPadding2D((1,1))
l14=Convolution2D(256, 3, 3, activation='relu')
l15=ZeroPadding2D((1,1))
l16=Convolution2D(256, 3, 3, activation='relu')
l17=MaxPooling2D((2,2), strides=(2,2))

l18=ZeroPadding2D((1,1))
l19=Convolution2D(512, 3, 3, activation='relu')
l20=ZeroPadding2D((1,1))
l21=Convolution2D(512, 3, 3, activation='relu')
l22=ZeroPadding2D((1,1))
l23=Convolution2D(512, 3, 3, activation='relu')
l24=MaxPooling2D((2,2), strides=(2,2))

l25=ZeroPadding2D((1,1))
l26=Convolution2D(512, 3, 3, activation='relu')
l27=ZeroPadding2D((1,1))
l28=Convolution2D(512, 3, 3, activation='relu')
l29=ZeroPadding2D((1,1))
l30=Convolution2D(512, 3, 3, activation='relu')
l31=MaxPooling2D((2,2), strides=(2,2))

l32=Flatten()
l33=Dense(4096, activation='relu')
l34=Dropout(0.5)
l35=Dense(4096, activation='relu')
l36=Dropout(0.5)
l37=Dense(1000, activation='softmax')

model = Sequential([l1,l2,l3,l4,l5,l6,l7,l8,l9,l10,l11,l12,l13,l14,l15,l16,l17,l18,l19,l20,l21,l22,l23,l24,l25,l26,l27,l28,l29,l30,l31,l32,l33,l34,l35,l36,l37])

#model = Sequential([l1,l2,l3,l4,l5,l6,l7,l8,l9,l10])
#model = Sequential([l1,l2,l3,l4,l5,l6,l7,l8,l9,l10])
print("Now Printing the model summary \n")
print(model.summary())

Note that I did not change any of the dimensions or hyperparameter values given in the code. I only modified it from a documentation point of view, for example naming the different layers and adding comments.

Also, please suggest ways to diagnose errors of this type on my own in the future.

The Error message is as follows:

runfile('/home/upendra/vgg_from_scratch', wdir='/home/upendra')
Found 200 images belonging to 2 classes.
Found 100 images belonging to 2 classes.
Found 60 images belonging to 2 classes.
/home/upendra/vgg_from_scratch:53: UserWarning: Update your `Conv2D` call to the Keras 2 API: `Conv2D(256, (3, 3), activation="relu")`
  l12=Convolution2D(256, 3, 3, activation='relu')
/home/upendra/vgg_from_scratch:55: UserWarning: Update your `Conv2D` call to the Keras 2 API: `Conv2D(256, (3, 3), activation="relu")`
  l14=Convolution2D(256, 3, 3, activation='relu')
/home/upendra/vgg_from_scratch:57: UserWarning: Update your `Conv2D` call to the Keras 2 API: `Conv2D(256, (3, 3), activation="relu")`
  l16=Convolution2D(256, 3, 3, activation='relu')
/home/upendra/vgg_from_scratch:61: UserWarning: Update your `Conv2D` call to the Keras 2 API: `Conv2D(512, (3, 3), activation="relu")`
  l19=Convolution2D(512, 3, 3, activation='relu')
/home/upendra/vgg_from_scratch:63: UserWarning: Update your `Conv2D` call to the Keras 2 API: `Conv2D(512, (3, 3), activation="relu")`
  l21=Convolution2D(512, 3, 3, activation='relu')
/home/upendra/vgg_from_scratch:65: UserWarning: Update your `Conv2D` call to the Keras 2 API: `Conv2D(512, (3, 3), activation="relu")`
  l23=Convolution2D(512, 3, 3, activation='relu')
/home/upendra/vgg_from_scratch:69: UserWarning: Update your `Conv2D` call to the Keras 2 API: `Conv2D(512, (3, 3), activation="relu")`
  l26=Convolution2D(512, 3, 3, activation='relu')
/home/upendra/vgg_from_scratch:71: UserWarning: Update your `Conv2D` call to the Keras 2 API: `Conv2D(512, (3, 3), activation="relu")`
  l28=Convolution2D(512, 3, 3, activation='relu')
/home/upendra/vgg_from_scratch:73: UserWarning: Update your `Conv2D` call to the Keras 2 API: `Conv2D(512, (3, 3), activation="relu")`
  l30=Convolution2D(512, 3, 3, activation='relu')
Traceback (most recent call last):

  File "<ipython-input-4-56412ac381d0>", line 1, in <module>
    runfile('/home/upendra/vgg_from_scratch', wdir='/home/upendra')

  File "/home/upendra/anaconda3/lib/python3.6/site-packages/spyder_kernels/customize/spydercustomize.py", line 668, in runfile
    execfile(filename, namespace)

  File "/home/upendra/anaconda3/lib/python3.6/site-packages/spyder_kernels/customize/spydercustomize.py", line 108, in execfile
    exec(compile(f.read(), filename, 'exec'), namespace)

  File "/home/upendra/vgg_from_scratch", line 83, in <module>
    model = Sequential([l1,l2,l3,l4,l5,l6,l7,l8,l9,l10,l11,l12,l13,l14,l15,l16,l17,l18,l19,l20,l21,l22,l23,l24,l25,l26,l27,l28,l29,l30,l31,l32,l33,l34,l35,l36,l37])

  File "/home/upendra/anaconda3/lib/python3.6/site-packages/keras/engine/sequential.py", line 92, in __init__
    self.add(layer)

  File "/home/upendra/anaconda3/lib/python3.6/site-packages/keras/engine/sequential.py", line 185, in add
    output_tensor = layer(self.outputs[0])

  File "/home/upendra/anaconda3/lib/python3.6/site-packages/keras/engine/base_layer.py", line 457, in __call__
    output = self.call(inputs, **kwargs)

  File "/home/upendra/anaconda3/lib/python3.6/site-packages/keras/layers/pooling.py", line 157, in call
    data_format=self.data_format)

  File "/home/upendra/anaconda3/lib/python3.6/site-packages/keras/layers/pooling.py", line 220, in _pooling_function
    pool_mode='max')

  File "/home/upendra/anaconda3/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py", line 3880, in pool2d
    data_format=tf_data_format)

  File "/home/upendra/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/nn_ops.py", line 2153, in max_pool
    name=name)

  File "/home/upendra/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/gen_nn_ops.py", line 4640, in max_pool
    data_format=data_format, name=name)

  File "/home/upendra/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)

  File "/home/upendra/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3414, in create_op
    op_def=op_def)

  File "/home/upendra/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1756, in __init__
    control_input_ops)

  File "/home/upendra/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1592, in _create_c_op
    raise ValueError(str(e))

ValueError: Negative dimension size caused by subtracting 2 from 1 for 'max_pooling2d_9/MaxPool' (op: 'MaxPool') with input shapes: [?,1,112,128].

I'm not sure where the error is, but I suggest you to use the most recent Keras API (e.g. get rid of warnings) and take inspiration from https://github.com/keras-team/keras-applications/blob/master/keras_applications/vgg16.py — marco romelli, Aug 28 '18 at 10:31
Change your `Conv2D` calls according to the latest API and do not create the model like you did; rather use it like `model = Model(inputs=img, outputs=l37)`. — Anakin, Aug 28 '18 at 10:53

score 0 · Answer 1 · answered May 21 '19 at 03:12

I suspect your Conv2D definitions are wrong.

Where you have something like this:

Convolution2D(512, 3, 3, activation='relu')

I think you mean this:

Convolution2D(512, (3, 3), activation='relu')

You should probably use keyword arguments rather than positional ones to avoid confusion. Read positionally, your arguments imply this:

Convolution2D(filters=512, kernel_size=3, strides=3, activation='relu')
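To see what a stride of 3 would do to the feature-map size, here is a minimal sketch of the usual output-size arithmetic for an unpadded ('valid') convolution along one axis. The helper name conv_output_size is made up for illustration and is not part of Keras. Note also that the UserWarning lines in the question suggest this Keras version rewrote the legacy calls as (3, 3) kernels, so treat the stride-3 reading as something to rule out rather than a confirmed cause.

```python
def conv_output_size(size, kernel, stride=1):
    """Output length along one axis for a convolution with no implicit padding."""
    return (size - kernel) // stride + 1


# A 224-pixel axis after ZeroPadding2D((1, 1)) is 226 pixels long.
padded = 224 + 2

# 3x3 kernel, stride 1: the padded axis shrinks back to 224 (what VGG16 expects).
print(conv_output_size(padded, kernel=3, stride=1))  # 224

# 3x3 kernel, stride 3: the axis drops to roughly a third of its length.
print(conv_output_size(padded, kernel=3, stride=3))  # 75
```

If the stride really were 3, every such layer would cut both height and width to about a third, which would show up in model.summary() as both dimensions shrinking together.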

I don't remember VGG16 having a stride of (3, 3), which is what you've defined. Correct me if I'm wrong and I'll update this (I don't have the VGG architecture burned into my head).

Notice that the input shape to max_pooling2d_9/MaxPool is [?,1,112,128]. That should correspond to the line l10=MaxPooling2D((2,2), strides=(2,2)), because l9 is the only layer directly before a max pool that outputs 128 feature maps. You should also add name='a_useful_name' to all your layers to make debugging easier; max_pooling2d_9/MaxPool is maddeningly hard to trace back to the code.

That shape [?,1,112,128] refers to:

? - unspecified batch dimension
1 - image height at layer l10 (that's the output of l9, which we expect to match the third value, 112), so this is the problem child.
112 - image width at layer l10 (this is looking correct)
128 - the number of filters (aka channels) input to the max pool layer.
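One way to see how the height could end up as 1 is to trace the shapes by hand. The sketch below is plain Python, not Keras, and all function names in it are made up for illustration. It assumes the backend uses the channels_last layout (height, width, channels), which the trailing 128 in the reported shape suggests. Under that assumption, input_shape=(3,224,224) is read as a 3-pixel-high image with 224 channels, and the trace reproduces the shape from the error message.

```python
def pad(shape, p=1):
    """ZeroPadding2D((p, p)) on a (height, width, channels) shape."""
    h, w, c = shape
    return (h + 2 * p, w + 2 * p, c)


def conv3x3(shape, filters):
    """3x3 convolution, stride 1, no implicit padding."""
    h, w, _ = shape
    return (h - 2, w - 2, filters)


def pool2x2(shape):
    """2x2 max pool with stride 2; fails if a spatial axis is shorter than 2."""
    h, w, c = shape
    if h < 2 or w < 2:
        raise ValueError(f"cannot apply 2x2 pooling to shape {shape}")
    return ((h - 2) // 2 + 1, (w - 2) // 2 + 1, c)


def trace_first_two_blocks(input_shape):
    """Trace l1..l10 of the question's model.

    Returns (shapes, error): shapes after each conv/pool layer, and the
    error message if a pooling layer could not be applied (else None).
    """
    x = input_shape
    shapes = [x]
    try:
        for filters in (64, 128):
            for _ in range(2):
                x = conv3x3(pad(x), filters)
                shapes.append(x)
            x = pool2x2(x)
            shapes.append(x)
    except ValueError as err:
        return shapes, str(err)
    return shapes, None


# Shape as written in the question, read as (height=3, width=224, channels=224).
shapes, error = trace_first_two_blocks((3, 224, 224))
print(shapes[-1], error)  # last good shape is (1, 112, 128), then pooling fails

# Same layers with the image given as (height, width, channels).
shapes, error = trace_first_two_blocks((224, 224, 3))
print(shapes[-1], error)  # (56, 56, 128) None
```

If this assumption holds for your setup, passing input_shape=(224,224,3) (or configuring Keras for channels_first) is the thing to try first.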

If I didn't hit the nail on the head, I hope I gave you enough insight into the model architecture and what to expect and where to look to help you track it down.

A good troubleshooting step is to create the model with l6 as the output layer and, instead of running fit, run predict to check that the output at that layer has the shape you expect. Repeat with l7, l8, and so on. Pretty quickly you'll hit an output shape that is unexpected.
