
I've trained up a model and saved it in a checkpoint, but only just realized that I forgot to name one of the variables I'd like to inspect when I restore the model.

I know how to retrieve named variables from TensorFlow (`g = tf.get_default_graph()` and then `g.get_tensor_by_name(name)`). In this case I know its scope, but it is unnamed. I've tried looking in `tf.GraphKeys.GLOBAL_VARIABLES`, but for some reason it doesn't appear there.

Here's how it's defined in the model:

with tf.name_scope("contrastive_loss") as scope:
    l2_dist = tf.cast(tf.sqrt(1e-4 + tf.reduce_sum(tf.subtract(pred_left, pred_right), 1)), tf.float32) # the variable I want

    # I use it here when calculating another named tensor, if that helps.
    con_loss = contrastive_loss(l2_dist) 
    loss = tf.reduce_sum(con_loss, name="loss")

Is there any way of finding the variable without a name?

Engineero
Jess
  • [This SO question might help](https://stackoverflow.com/questions/36533723/tensorflow-get-all-variables-in-scope). Basically you can specify scope with `tf.get_collection`. – Engineero Jun 19 '17 at 20:40
  • Hmm, I saw that -- the variable doesn't appear in `tf.GraphKeys.GLOBAL_VARIABLES`, and trying to further specify the scope returns an empty array. Now that I think about it, this is kind of odd, since I should have the named variable `loss` in the scope as well...not sure what's happening. – Jess Jun 19 '17 at 20:40
  • It seems like `tf.GraphKeys.GLOBAL_VARIABLES` only shows the variables defined with `tf.Variable`. – Jess Jun 19 '17 at 20:44
  • Oh, right, because your `l2_dist` is an operation, not a variable. Gotta go to a meeting but I'll try to help when I'm back if you haven't figured it out. Look into retrieving operations. – Engineero Jun 19 '17 at 20:54
  • Maybe it is a stupid comment, but it seems easier to me to build the training model again and try to restore the variable there. It should be assigned the same name as in the first run. Then copy it to a named variable and save it again. I don't know for sure, but in theory it should do the job. – Giuseppe Marra Jun 19 '17 at 22:30

2 Answers


First of all, following up on my first comment, it makes sense that `tf.get_collection` with a name scope is not working: according to the documentation, if you provide a scope, only variables or operations with assigned names are returned. So that's out.

One thing you can try is to list the name of every node in your Graph with:

print([node.name for node in tf.get_default_graph().as_graph_def().node])

Or possibly, when restoring from a checkpoint:

saver = tf.train.import_meta_graph('/path/to/meta/graph')
sess = tf.Session()
saver.restore(sess, '/path/to/checkpoints')
graph = sess.graph
print([node.name for node in graph.as_graph_def().node])
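
Since the unnamed op still lives under the contrastive_loss name scope, you can narrow that full node list down to the scope prefix with plain string filtering. A minimal sketch (the node names below are made up for illustration; unnamed ops typically show up under their op type, like Cast or Sqrt, prefixed by the scope):

```python
# Hypothetical node names, shaped like TensorFlow's auto-generated ones.
node_names = [
    "pred_left",
    "pred_right",
    "contrastive_loss/Sub",
    "contrastive_loss/Sum",
    "contrastive_loss/Sqrt",
    "contrastive_loss/Cast",
    "contrastive_loss/loss",
]

# Keep only the nodes inside the scope you remember.
in_scope = [n for n in node_names if n.startswith("contrastive_loss/")]
print(in_scope)
```

With the real graph you would build node_names from graph.as_graph_def().node as above.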

Another option is to display the graph using TensorBoard, or in a Jupyter Notebook with the `show_graph` command. There might be a built-in `show_graph` now, but that link is to a git repository where one is defined. You will then have to search for your operation in the graph and retrieve it with:

my_op = tf.get_default_graph().get_operation_by_name('full_operation_name')

If you want to set it up in the future so that you can retrieve it by name, you need to add it to a collection using tf.add_to_collection:

my_op = tf.some_operation(stuff, name='my_op')
tf.add_to_collection('my_op_name', my_op)

Then retrieve it by restoring your graph and then using:

my_restored_op = tf.get_collection('my_op_name')[0]

You might also be able to get by just naming it and then specifying its scope in tf.get_collection instead, but I am not sure. More information and a helpful tutorial can be found here.
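
A related gotcha once you do name the op: get_operation_by_name expects the bare operation name, while get_tensor_by_name expects the name of one of its output tensors, which carries an output index after a colon. A sketch of the convention (the names here are illustrative):

```python
# An operation named "l2_dist" inside the "contrastive_loss" scope:
op_name = "contrastive_loss/l2_dist"   # for graph.get_operation_by_name(...)
tensor_name = op_name + ":0"           # first output, for graph.get_tensor_by_name(...)
print(op_name, tensor_name)
```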

Autonomous
Engineero
  • Thanks, this is super helpful! As a follow-up to the distinction b/w retrieving operations vs. tensors -- if I did name the l2_dist cast operation, how would I retrieve it? Would it be graph.get_tensor_by_name("contrastive_loss/l2_dist:0") or graph.get_operation_by_name("contrastive_loss/l2_dist:0") or..? – Jess Jun 20 '17 at 14:23
  • As a test, I've tried naming the cast operation (`l2_dist = tf.cast(...., tf.float32, name="l2_dist")`) and trying to retrieve it, but I can't seem to find it in the list of operations or tensors. Is cast different in some way? – Jess Jun 20 '17 at 14:33
  • @Jess updated my answer to hopefully address your question. I struggled with this too recently; hope it helps! – Engineero Jun 20 '17 at 14:54

tf.get_collection does not work with unnamed variables. So list the operations with:

graph = sess.graph
print(graph.get_operations())

... find your tensor in the list and then:

global_step_tensor = graph.get_tensor_by_name('complete_operation_name:0')

And I found this tutorial very helpful for understanding the mechanism behind this.
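
If you don't know the operation's name at all, it helps to know how TensorFlow names unnamed ops: each gets its op type as a default name ("Cast", "Sqrt", ...), with "_1", "_2", ... appended for repeats, all prefixed by the enclosing name scope. A pure-Python sketch of that pattern (this emulates the scheme for illustration; it is not TensorFlow itself):

```python
from collections import defaultdict

# Emulate TensorFlow's default naming for unnamed ops:
# the first op of a type in a scope gets the bare type name,
# later ones get a "_1", "_2", ... suffix.
counts = defaultdict(int)

def default_name(scope, op_type):
    n = counts[(scope, op_type)]
    counts[(scope, op_type)] += 1
    suffix = "" if n == 0 else "_%d" % n
    return "%s/%s%s" % (scope, op_type, suffix)

print(default_name("contrastive_loss", "Cast"))  # -> contrastive_loss/Cast
print(default_name("contrastive_loss", "Cast"))  # -> contrastive_loss/Cast_1
```

So the unnamed tf.cast in the question most likely shows up as contrastive_loss/Cast in graph.get_operations().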

cojoc
  • The question is how to get a tensor without knowing its name; `get_tensor_by_name` is already mentioned in the question. – Lex Jun 01 '18 at 23:28