Training a model with Resnet152, saving the weights, loading them and adding more layers issue

Question

My goal is to first train only using ResNet152 and then save the learned weights. Then i want to use these weights as a base for a more complex model with added layers which i ultimately want to do hyperparameter tuning on. The reason for this approach is that doing it all at once takes a very long time. The problem i am having is that my code doesnt seem to work. I dont get an error message but when i start training the more complex model it seems to start from 0 again and not using the learned ResNet152 weights.

Here is the code:

First i am only using ResNet152 and the output layer

input_tensor = Input(shape=train_generator.image_shape)

base_model = applications.ResNet152(weights='imagenet', include_top=False, input_tensor=input_tensor)

for layer in base_model.layers[:]:
   layer.trainable = True¨

x = Flatten()(base_model.output)
  
predictions = Dense(num_classes, activation= 'softmax')(x)

model = Model(inputs = base_model.input, outputs = predictions)

model.compile(
  loss='sparse_categorical_crossentropy',
  optimizer=opt,
  metrics=['accuracy'])

model.fit(
  train_generator,
  validation_data=valid_generator,
  epochs=epochs,
  steps_per_epoch=len_train // batch_size,
  validation_steps=len_val // batch_size,
  callbacks=[earlyStopping, reduce_lr]
)

Then i am saving the weights:

model.save_weights('/content/drive/MyDrive/MODELS_SAVED/model_RESNET152/model_weights.h5')

Adding more layers.


input_tensor = Input(shape=train_generator.image_shape)

base_model = applications.ResNet152(weights='imagenet', include_top=False, input_tensor=input_tensor)

for layer in base_model.layers[:]:
    layer.trainable = False 

x = Flatten()(base_model.output)
 
x = Dense(1024, kernel_regularizer=tf.keras.regularizers.L2(l2=0.01), 
          kernel_initializer=tf.keras.initializers.HeNormal(), 
          kernel_constraint=tf.keras.constraints.UnitNorm(axis=0))(x)
x = LeakyReLU()(x)
x = BatchNormalization()(x)
x = Dropout(rate=0.1)(x)
x = Dense(512, kernel_regularizer=tf.keras.regularizers.L2(l2=0.01), 
          kernel_initializer=tf.keras.initializers.HeNormal(), 
          kernel_constraint=tf.keras.constraints.UnitNorm(axis=0))(x)
x = LeakyReLU()(x)
x = BatchNormalization()(x)
  
predictions = Dense(num_classes, activation= 'softmax')(x)
model = Model(inputs = base_model.input, outputs = predictions)

Loading the weights after the added layers and using by_name=True, both according to the keras tutorial.

model.load_weights('/content/drive/MyDrive/MODELS_SAVED/model_RESNET152/model_weights.h5', by_name=True)

Then i start training again.

model.compile(
    loss='sparse_categorical_crossentropy',
    optimizer=opt,
    metrics=['accuracy']
    )

model.fit(
  train_generator,
  validation_data=valid_generator,
  epochs=epochs,
  steps_per_epoch=len_train // batch_size,
  validation_steps=len_val  // batch_size,
  callbacks=[earlyStopping, reduce_lr]
)

But it is starting at a very low accuracy, basically from 0 again, so im guessing something is wrong here. Any ideas on how to fix this?

@BDouchet Im trying your idea right now, will have results in about an hour because im running Resnet for a few more epochs to be sure if it works or not — JKnecht, Dec 12 '20 at 09:51

Andrey · Accepted Answer · 2020-12-12T14:46:15.563

1

When you use adam and save model weights only - you have to save/load optimizer weights as well:

  weight_values = model.optimizer.get_weights()
  with open(output_path+'optimizer.pkl', 'wb') as f:
      pickle.dump(weight_values, f)

  dummy_input = tf.random.uniform(inp_shape) # create a tensor of input shape
  dummy_label = tf.random.uniform(label_shape) # create a tensor of label shape
  hist = model.fit(dummy_input, dummy_label)
  with open(path_to_saved_model+'optimizer.pkl', 'rb') as f:
      weight_values = pickle.load(f)
  optimizer.set_weights(weight_values)

edited Dec 12 '20 at 14:46

answered Dec 12 '20 at 10:12

Andrey

5,932
3
17
35

Im pretty new to this. With this code i save the weights, right? So i put this code right after model.save_weights()?. What is the code for loading the optimizer weights from this pickle dump? I have never used pickle. – JKnecht Dec 12 '20 at 10:25
Is this right? ``` with open("weight_values", "rb") as f: weights = Pickle.load(f) Optimizer.set_weights(weights) ``` – JKnecht Dec 12 '20 at 10:32
Thanks, i think it works now!! First i trained resnet152 for 3 epochs and got loss: 2.4604 - accuracy: 0.4477 ...then i saved and built the new model and it started at Epoch 1/3 114/3969 [..] - ETA: 27:02 - loss: 19.4842 - accuracy: 0.2174...where it earlier started at accuracy 0. This new start is reasonable, right? If anyone else runs into this problem i want to add that you also need to follow BDouchets advice for it to work, i.e. set base_model.load_weights(your path) instead of model.load_weights. – JKnecht Dec 12 '20 at 11:41
@JKnecht accuracy should be 0.44 (unless your accuracy is too volatile). I updated the answer - try to fit with dummy input before loading weights – Andrey Dec 12 '20 at 12:38
Ok, but was is a dummy input? I tried googling it and didnt find much but it should be some "meaningless", random input of the same shape as the 224*224 images i am feeding my model? Can you help me with the code for the dummy input also? I dont know if it matters but i am using a train_generator = _train_generator.flow_from_dataframe(dataframe = train_info, directory = train_folder, x_col = "filename", y_col = "artist", seed = 42, batch_size = 16, shuffle = True, class_mode="sparse", target_size = (224,224)) – JKnecht Dec 12 '20 at 13:11
@JKnecht see my edits. Tie input_shape, label_shape to your model. I don't use generators, so I can not help here – Andrey Dec 12 '20 at 14:49

Training a model with Resnet152, saving the weights, loading them and adding more layers issue

1 Answers1