I've developed a model that requires two copies of its parameters: one from before the training step and one from after it. I thought I could do this simply with tf.assign() calls, but it seems that this has massively slowed down training.
Why does tf.assign() slow down execution so much?
This post asks a similar question, but that author only needs to update the learning rate and can do so by adding a feed_dict, whereas calling tf.assign can't really be avoided in my case. The other suggested solution was to separate the graph definition from the graph execution, but since both models have to live in the same session (they need access to each other's parameters), I'm unsure how to apply it.
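If I understand that suggestion correctly, it would mean building the assign ops once, before the training loop, and only running them inside it. A rough sketch of what I think that looks like (the names here are placeholders, not my actual code):
import tensorflow as tf

var = tf.trainable_variables()
# build the copy ops once, outside the loop
copy_hidden = var[0].assign(var[2])
copy_value = var[1].assign(var[3])

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for step in range(num_steps):  # num_steps stands in for my real loop
        sess.run([copy_hidden, copy_value])  # only run the pre-built ops
        # Q_agent.train(...)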
Any help is appreciated.
The code is as simple as:
tf.assign(var[0], var[2])  # copy var[2] into var[0]
tf.assign(var[1], var[3])  # copy var[3] into var[1]
Q_agent.train(...)
where var[0] and var[1] are the parameters of the Q_agent.
The training time is quite long in this case, so I've adapted the code to try to use a tf.placeholder instead. The code is as follows:
var = tf.trainable_variables()
params = [var[4], var[5]]
update_hidden = tf.placeholder(params[0].dtype, shape=params[0].get_shape())
update_value = tf.placeholder(params[1].dtype, shape=params[1].get_shape())

for ...:  # training loop
    var = tf.trainable_variables()
    old_hidden = var[0]
    old_value = var[1]
    new_hidden = var[2]
    new_value = var[3]
    update_h = old_hidden.assign(update_hidden)
    update_v = old_value.assign(update_value)
    sess.run([update_h, update_v],
             feed_dict={update_hidden: new_hidden.eval(),
                        update_value: new_value.eval()})
Though the train function now runs quickly, this hasn't improved the overall efficiency of the code, because there is a continuous slowdown when running update_h and update_v. Any ideas?
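In case it helps diagnose this, one check I could add inside the loop is to see whether the default graph keeps growing from one iteration to the next:
# if this count increases every iteration, new ops are being added to the
# graph inside the loop
print(len(tf.get_default_graph().get_operations()))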