5

I am using the VGG-16 network in Keras. This is the detail

My problem is how to fine-tune this network. Must I use 224×224 images with it? And must I keep 1000 classes? If I don't use 1000 classes, I get this error:

Exception: Layer shape (4096L, 10L) not compatible with weight shape (4096, 1000).

Any help is appreciated, thank you!

stop-cran
sky

1 Answer

6

I posted a detailed answer in this issue if you want to take a look. The following snippet will help you with the dimension of your last layer:

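from keras.models import Sequential, Graph
from keras.layers import Convolution2D, ZeroPadding2D, MaxPooling2D
# Flatten, Dense, Dropout and SGD are used further down, so they must be imported too
from keras.layers import Flatten, Dense, Dropout
from keras.optimizers import SGD
import keras.backend as K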

img_width, img_height = 128, 128

# build the VGG16 network with our input_img as input
first_layer = ZeroPadding2D((1, 1), input_shape=(3, img_width, img_height))

model = Sequential()
model.add(first_layer)
model.add(Convolution2D(64, 3, 3, activation='relu', name='conv1_1'))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(64, 3, 3, activation='relu', name='conv1_2'))
model.add(MaxPooling2D((2, 2), strides=(2, 2)))

model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(128, 3, 3, activation='relu', name='conv2_1'))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(128, 3, 3, activation='relu', name='conv2_2'))
model.add(MaxPooling2D((2, 2), strides=(2, 2)))

model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(256, 3, 3, activation='relu', name='conv3_1'))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(256, 3, 3, activation='relu', name='conv3_2'))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(256, 3, 3, activation='relu', name='conv3_3'))
model.add(MaxPooling2D((2, 2), strides=(2, 2)))

model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(512, 3, 3, activation='relu', name='conv4_1'))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(512, 3, 3, activation='relu', name='conv4_2'))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(512, 3, 3, activation='relu', name='conv4_3'))
model.add(MaxPooling2D((2, 2), strides=(2, 2)))

model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(512, 3, 3, activation='relu', name='conv5_1'))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(512, 3, 3, activation='relu', name='conv5_2'))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(512, 3, 3, activation='relu', name='conv5_3'))
model.add(MaxPooling2D((2, 2), strides=(2, 2)))

# map each layer name to its layer object (we gave the "key" layers unique names).
layer_dict = dict([(layer.name, layer) for layer in model.layers])

# load the weights

import h5py

weights_path = 'vgg16_weights.h5'

f = h5py.File(weights_path, 'r')  # open the weight file read-only
for k in range(f.attrs['nb_layers']):
    if k >= len(model.layers):
        # we don't look at the last (fully-connected) layers in the savefile
        break
    g = f['layer_{}'.format(k)]
    weights = [g['param_{}'.format(p)] for p in range(g.attrs['nb_params'])]
    model.layers[k].set_weights(weights)
f.close()
print('Model loaded.')

# Here is what you want:

graph_m = Graph()
graph_m.add_input('my_inp', input_shape=(3, img_width, img_height))
graph_m.add_node(model, name='your_model', input='my_inp')
graph_m.add_node(Flatten(), name='Flatten', input='your_model')
graph_m.add_node(Dense(4096, activation='relu'), name='Dense1', input='Flatten')
graph_m.add_node(Dropout(0.5), name='Dropout1', input='Dense1')
graph_m.add_node(Dense(4096, activation='relu'), name='Dense2', input='Dropout1')
graph_m.add_node(Dropout(0.5), name='Dropout2', input='Dense2')
graph_m.add_node(Dense(10, activation='softmax'), name='Final', input='Dropout2')
graph_m.add_output(name='out1', input='Final')
sgd = SGD(lr=0.1, decay=1e-6, momentum=0.9, nesterov=True)
graph_m.compile(optimizer=sgd, loss={'out1': 'categorical_crossentropy'})
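
To see why the convolutional weights can be reused at 128×128 while the original fully connected weights cannot, it helps to trace the feature-map size through the network. The following is a minimal plain-Python sketch (no Keras needed). The helper name `vgg16_flatten_size` is made up for illustration, and it assumes the layout of the snippet above: padded 3×3 convolutions that keep the spatial size, and five 2×2 max-pooling layers with stride 2.

```python
def vgg16_flatten_size(img_width, img_height, channels=512, n_pools=5):
    """Return the number of features that Flatten() produces after the conv stack.

    Assumes each ZeroPadding2D((1, 1)) + 3x3 convolution pair keeps the spatial
    size, and each 2x2 max pooling with stride 2 halves it (rounding down).
    """
    w, h = img_width, img_height
    for _ in range(n_pools):
        w, h = w // 2, h // 2
    return channels * w * h


print(vgg16_flatten_size(224, 224))  # 25088
print(vgg16_flatten_size(128, 128))  # 8192
```

The convolution kernels do not depend on the image size, so they load fine at 128×128. The first Dense layer, however, sees 8192 features instead of 25088, so its pretrained weight matrix no longer fits and it has to be trained from scratch, just like the new 10-way output layer.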

Note that you can freeze the feature-extraction layers and fine-tune only the last fully connected layers. According to the docs, you just have to pass trainable=False to freeze a layer. Example of a frozen layer:

...
model.add(Convolution2D(512, 3, 3, activation='relu', name='conv5_1', trainable=False))
...

Example of a trainable layer:

...
model.add(Convolution2D(512, 3, 3, activation='relu', name='conv5_1', trainable=True))
...

trainable is True by default, so every layer is trained unless you explicitly freeze it.
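
As a rough illustration of what freezing changes, the sketch below counts the weights in the convolutional stack and in the new classifier head. It is plain Python arithmetic, not Keras code. The helper names are hypothetical, and it assumes that every convolution is created with trainable=False, the 128×128 input from the snippet above, and one bias per output unit.

```python
# (input_channels, output_channels) for the 13 convolutions in the snippet above
CONV_LAYERS = [
    (3, 64), (64, 64),
    (64, 128), (128, 128),
    (128, 256), (256, 256), (256, 256),
    (256, 512), (512, 512), (512, 512),
    (512, 512), (512, 512), (512, 512),
]


def conv_params(c_in, c_out, k=3):
    # k x k kernel per (input, output) channel pair, plus one bias per output
    return k * k * c_in * c_out + c_out


def dense_params(n_in, n_out):
    # full weight matrix plus one bias per output unit
    return n_in * n_out + n_out


frozen = sum(conv_params(c_in, c_out) for c_in, c_out in CONV_LAYERS)
flat = 512 * 4 * 4  # 128x128 input after five 2x2 poolings
head = (dense_params(flat, 4096)
        + dense_params(4096, 4096)
        + dense_params(4096, 10))

print(frozen)  # 14714688 weights kept fixed
print(head)    # 50380810 weights still updated
```

Under these assumptions the optimizer only updates the head. The frozen layers still run in the forward pass; they simply keep the values loaded from the h5 file.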

  • Thanks very much! So in your code I only need to edit the final layer to match my number of classes? And if I set trainable=False on every layer, the weights loaded from the h5 file can't be updated? – sky Apr 12 '16 at 00:02
  • Exactly, and it will be much faster to train the network on your problem. – Thomas W. Boquet Apr 12 '16 at 00:09
  • Note that you could use a second sequential model instead of the graph model. – Thomas W. Boquet Apr 12 '16 at 00:10
  • Thanks for your reply! I am new to Keras. I am now using your code and have run into a problem; can you help me?
    ` File "E:/DL/vgg-16.py", line 125, in graph_m.fit(train_data,train_label,nb_epoch=10,shuffle=True,verbose=1,validation_split=0.2) File "D:\Anaconda2\lib\site-packages\keras\models.py", line 1184, in fit X = [data[name] for name in self.input_order] IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices`
    – sky Apr 12 '16 at 04:46
  • When I use your code, I get this error: ` File "E:/DL/vgg-16.py", line 125, in graph_m.fit(train_data,train_label,nb_epoch=10,shuffle=True,verbose=1,validation_split=0.2) File "D:\Anaconda2\lib\site-packages\keras\models.py", line 1184, in fit X = [data[name] for name in self.input_order] IndexError: only integers, slices (:), ellipsis (...), numpy.newaxis (None) and integer or boolean arrays are valid indices` Can you tell me how to solve it? Thanks! – sky Apr 13 '16 at 02:02
  • You have to pass a dictionary of inputs to the `Graph` model. You can easily replace the `Graph` model with a `Sequential` model: replace the `add_node` method with `add`, remove the names and the inputs, and remove the `add_input` and `add_output` calls. Feel free to read the [Keras documentation](http://keras.io/) for more details on how to chain models. – Thomas W. Boquet Apr 14 '16 at 14:23
  • @ThomasW.Boquet Can you please take a look at https://stackoverflow.com/questions/62903492/modify-some-values-in-the-weight-file-h5-of-vgg-16 – noobcoder Jul 14 '20 at 22:01
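
The IndexError discussed in the comments comes from the lookup shown in the traceback, `X = [data[name] for name in self.input_order]`, which indexes the data by input name. The sketch below mimics that lookup in plain Python to show why an array fails and a dictionary works. `collect_inputs` is a hypothetical stand-in, not a Keras function, and the tiny lists stand in for the real image and label arrays.

```python
def collect_inputs(data, input_order):
    # Same lookup as the line shown in the traceback.
    return [data[name] for name in input_order]


train_data = [[0.0, 1.0], [1.0, 0.0]]  # stand-in for the image array
train_label = [[1, 0], [0, 1]]         # stand-in for one-hot labels

# Passing the array directly fails: it cannot be indexed by a name.
try:
    collect_inputs(train_data, ['my_inp'])
except (TypeError, IndexError) as exc:
    print('fails:', type(exc).__name__)

# A dictionary keyed by the names given to add_input / add_output works.
data = {'my_inp': train_data, 'out1': train_label}
inputs = collect_inputs(data, ['my_inp'])
print(inputs == [train_data])
```

Going by the comment and the traceback, the real call would presumably look like `graph_m.fit({'my_inp': train_data, 'out1': train_label}, nb_epoch=10, ...)`. Treat that as an assumption to check against the documentation of your Keras version.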