Using Keras from TensorFlow 1.4.1, how does one copy weights from one model to another?

As some background, I'm trying to implement a deep Q-network (DQN) for Atari games following the DQN publication by DeepMind. My understanding is that the implementation uses two networks, Q and Q'. The weights of Q are trained using gradient descent, and the weights are then copied periodically to Q'.

Here's how I build Q and Q':

import tensorflow as tf

ACT_SIZE   = 4
LEARN_RATE = 0.0025
OBS_SIZE   = 128

def buildModel():
  model = tf.keras.models.Sequential()

  # input_shape must be a tuple, not a bare int
  model.add(tf.keras.layers.Lambda(lambda x: x / 255.0, input_shape=(OBS_SIZE,)))
  model.add(tf.keras.layers.Dense(128, activation="relu"))
  model.add(tf.keras.layers.Dense(128, activation="relu"))
  model.add(tf.keras.layers.Dense(ACT_SIZE, activation="linear"))
  opt = tf.keras.optimizers.RMSprop(lr=LEARN_RATE)

  model.compile(loss="mean_squared_error", optimizer=opt)

  return model

I call that twice to get Q and Q'.

Below is my updateTargetModel function, which is my attempt at copying weights. The code runs fine, but my overall DQN implementation is failing. I'm really just trying to verify whether this is a valid way of copying weights from one network to another.

def updateTargetModel(model, targetModel):
  modelWeights       = model.trainable_weights
  targetModelWeights = targetModel.trainable_weights

  for i in range(len(targetModelWeights)):
    targetModelWeights[i].assign(modelWeights[i])

There's another question here that discusses saving and loading weights to and from disk (Tensorflow Copy Weights Issue), but it has no accepted answer. There is also a question about loading weights from individual layers (Copying weights from one Conv2D layer to another), but I want to copy the entire model's weights.

Marcin Możejko
benbotto
  • Keras FAQ covers saving and loading model weights. You can save/load all weights or you can go by layer as well: https://keras.io/getting-started/faq/#how-can-i-save-a-keras-model – Manngo Jan 31 '18 at 17:57
  • Thank you Manngo. I have reviewed saving and loading models, and mentioned as much at the end of my question via the first linked question. My question, however, is about copying weights directly from one model to another without the intermediary file. – benbotto Jan 31 '18 at 19:14

1 Answer

Actually, what you've done is much more than simply copying weights: you've made the two models identical at all times. Every time you update one model, the second one is updated too, because both models share the same weight variables.

If you just want to copy the weights, the simplest way is this command:

target_model.set_weights(model.get_weights()) 
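
As a quick sanity check of the one-liner above, here is a minimal sketch, assuming a recent TensorFlow 2.x install rather than the 1.4.1 from the question (`get_weights` and `set_weights` are Keras model methods in both). The helper names `build_model` and `weights_equal` and the smaller two-layer network are made up for illustration. It copies the weights once, then changes the source model and checks that the target did not follow.

```python
import numpy as np
import tensorflow as tf

OBS_SIZE = 128  # same sizes as in the question
ACT_SIZE = 4


def build_model():
    # Hypothetical, smaller stand-in for the question's buildModel()
    return tf.keras.Sequential([
        tf.keras.Input(shape=(OBS_SIZE,)),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(ACT_SIZE, activation="linear"),
    ])


def weights_equal(a, b):
    # Compare every weight array of the two models element-wise
    return all(np.array_equal(x, y)
               for x, y in zip(a.get_weights(), b.get_weights()))


model = build_model()
target_model = build_model()

# get_weights() returns NumPy arrays; set_weights() writes those values
# into the target model's own variables.
target_model.set_weights(model.get_weights())
copied_ok = weights_equal(model, target_model)

# Change only the source model; the target keeps the values it was given.
model.set_weights([w + 1.0 for w in model.get_weights()])
still_equal = weights_equal(model, target_model)

print("equal right after copy:", copied_ok)
print("equal after changing the source:", still_equal)
```

In a DQN loop you would call the `set_weights` line every N training steps, where N is whatever update interval you choose.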
mrgloom
Marcin Możejko
  • Thanks Marcin, that's what I was worried about. So, the way I was doing it, the weights in Q' basically reference those in Q? Do I need to recompile the target model or do anything else after copying using your method? – benbotto Jan 31 '18 at 22:51
  • You're welcome :) I'm glad I could help. Good luck with your project. – Marcin Możejko Jan 31 '18 at 22:57
  • What is the difference with `clone_model`? https://github.com/keras-team/keras/issues/1765#issuecomment-324018225 – mrgloom May 22 '18 at 21:05
  • Just want to remind anyone reading this thread: the `clone_model` function, unlike what its name suggests, does NOT copy the weights. – DiveIntoML Feb 27 '20 at 22:43
  • @mrgloom, the referenced function creates a fresh model with the same architecture. This includes creating **fresh** weights. – emil Dec 15 '21 at 14:32
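
To illustrate the comments about `clone_model`, here is a small sketch, again assuming a recent TensorFlow 2.x. The tiny network and the variable names are made up for the example. It clones a model, checks that the clone's weights do not match the source, and then copies them over explicitly with `set_weights`.

```python
import numpy as np
import tensorflow as tf

# Tiny illustrative model; the layer sizes are arbitrary
source = tf.keras.Sequential([
    tf.keras.Input(shape=(8,)),
    tf.keras.layers.Dense(4, activation="relu"),
    tf.keras.layers.Dense(2),
])

# clone_model rebuilds the same architecture with newly initialized weights
clone = tf.keras.models.clone_model(source)


def same_weights(a, b):
    # True only if every weight array matches element-wise
    return all(np.array_equal(x, y)
               for x, y in zip(a.get_weights(), b.get_weights()))


same_before = same_weights(source, clone)

# An explicit copy is still needed to make the clone match the source
clone.set_weights(source.get_weights())
same_after = same_weights(source, clone)

print("match right after clone_model:", same_before)
print("match after set_weights:", same_after)
```

So for a target network, `clone_model` can stand in for a second call to the build function, but it does not replace the `set_weights(get_weights())` step.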