The key is to understand the difference between variables and other tensors in the graph. Any newly created tensor picks up a prefix from the enclosing name scope. tf.get_variable, however, ignores name scopes entirely: it looks existing variables up without the name-scope prefix, and variables it creates are not prefixed by a name scope either. Only a variable_scope augments the names that tf.get_variable creates and looks up.
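As a quick illustration (a minimal TF 1.x sketch; the scope and variable names here are arbitrary), an op created inside a name scope gets the prefix, while the variable created by tf.get_variable does not:

import tensorflow as tf

with tf.name_scope('layer'):
    v = tf.get_variable('v', shape=(), dtype=tf.float32)
    y = v + 1.

print(v.name)  # v:0          -- name scope ignored by tf.get_variable
print(y.name)  # layer/add:0  -- the add op picks up the name-scope prefix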
The script below works through these differences case by case. The intention is to reproduce the simple function by refactoring the tf.matmul(x, A) + b line and the variable creation into a separate add_layer function.
import tensorflow as tf

def get_x():
    return tf.constant([[1., 2., 3.]], dtype=tf.float32)

def report(out1, out2):
    print(out1.name)
    print(out2.name)
    variables = tf.get_collection(tf.GraphKeys.GLOBAL_VARIABLES)
    print([v.name for v in variables])

def simple():
    A = tf.get_variable(shape=(3, 3), dtype=tf.float32, name='A')
    b = tf.get_variable(shape=(3,), dtype=tf.float32, name='b')
    x = get_x()
    out1 = tf.matmul(x, A) + b
    out2 = tf.matmul(out1, A) + b
    return out1, out2

def add_layer(x):
    A = tf.get_variable(shape=(3, 3), dtype=tf.float32, name='A')
    b = tf.get_variable(shape=(3,), dtype=tf.float32, name='b')
    return tf.matmul(x, A) + b

def no_scoping():
    x = get_x()
    out1 = add_layer(x)
    out2 = add_layer(out1)
    return out1, out2

def different_name_scopes():
    x = get_x()
    with tf.name_scope('first_layer'):
        out1 = add_layer(x)
    with tf.name_scope('second_layer'):
        out2 = add_layer(out1)
    return out1, out2

def same_name_scope():
    x = get_x()
    with tf.name_scope('first_layer'):
        out1 = add_layer(x)
    with tf.name_scope('first_layer'):
        out2 = add_layer(out1)
    return out1, out2

def different_variable_scopes():
    x = get_x()
    with tf.variable_scope('first_layer'):
        out1 = add_layer(x)
    with tf.variable_scope('second_layer'):
        out2 = add_layer(out1)
    return out1, out2

def same_variable_scope():
    x = get_x()
    with tf.variable_scope('first_layer'):
        out1 = add_layer(x)
    with tf.variable_scope('first_layer'):
        out2 = add_layer(out1)
    return out1, out2

def same_variable_scope_reuse():
    x = get_x()
    with tf.variable_scope('first_layer'):
        out1 = add_layer(x)
    with tf.variable_scope('first_layer', reuse=True):
        out2 = add_layer(out1)
    return out1, out2

def test_fn(fn, name):
    graph = tf.Graph()
    with graph.as_default():
        try:
            print('****************')
            print(name)
            print('****************')
            out1, out2 = fn()
            report(out1, out2)
            print('----------------')
            print('SUCCESS')
            print('----------------')
        except Exception:
            print('----------------')
            print('FAILED')
            print('----------------')

for fn, name in [
        [simple, 'simple'],
        [no_scoping, 'no_scoping'],
        [different_name_scopes, 'different_name_scopes'],
        [same_name_scope, 'same_name_scope'],
        [different_variable_scopes, 'different_variable_scopes'],
        [same_variable_scope, 'same_variable_scope'],
        [same_variable_scope_reuse, 'same_variable_scope_reuse']]:
    test_fn(fn, name)
Results:
****************
simple
****************
add:0
add_1:0
[u'A:0', u'b:0']
----------------
SUCCESS
----------------
****************
no_scoping
****************
----------------
FAILED
----------------
****************
different_name_scopes
****************
----------------
FAILED
----------------
****************
same_name_scope
****************
----------------
FAILED
----------------
****************
different_variable_scopes
****************
first_layer/add:0
second_layer/add:0
[u'first_layer/A:0', u'first_layer/b:0', u'second_layer/A:0', u'second_layer/b:0']
----------------
SUCCESS
----------------
****************
same_variable_scope
****************
----------------
FAILED
----------------
****************
same_variable_scope_reuse
****************
first_layer/add:0
first_layer_1/add:0
[u'first_layer/A:0', u'first_layer/b:0']
----------------
SUCCESS
----------------
Each failed case raises a ValueError because tf.get_variable finds that a variable named A (or first_layer/A) already exists and reuse was not enabled. Note that using different variable_scopes without reuse does not raise an error, but it creates separate copies of A and b, which may not be what you intend. Also note the op names in same_variable_scope_reuse: reopening a variable scope opens a fresh, uniquified name scope for ops (first_layer and first_layer_1), even though the variables themselves are shared.
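If the goal is to share A and b while keeping both add_layer calls symmetric, one option is reuse=tf.AUTO_REUSE, which creates the variables on the first call and reuses them afterwards. This is only a sketch, assuming TensorFlow 1.4+ (where tf.AUTO_REUSE exists) and reusing get_x and add_layer from the script above:

def same_variable_scope_auto_reuse():
    x = get_x()
    # AUTO_REUSE creates first_layer/A and first_layer/b on the first call
    # and reuses them on the second, so no ValueError is raised and only
    # one copy of each variable exists.
    with tf.variable_scope('first_layer', reuse=tf.AUTO_REUSE):
        out1 = add_layer(x)
    with tf.variable_scope('first_layer', reuse=tf.AUTO_REUSE):
        out2 = add_layer(out1)
    return out1, out2

Unlike same_variable_scope_reuse, this version does not have to treat the first call differently from later ones.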