Hi everyone. I am using TensorFlow 1.4 to train a U-Net-like model for my purposes. Due to the constraints of my hardware, the batch_size can only be set to 1 during training; anything larger causes an OOM error.

Here is my question: with batch_size equal to 1, does tf.layers.batch_normalization() work correctly (that is, are the moving average, moving variance, gamma, and beta handled properly)? Does such a small batch_size make it unstable?

In my code, I set training=True when training and training=False when testing. For training, I use:

logits = mymodel.inference()
# TF 1.x spells this tf.losses.mean_squared_error
loss = tf.losses.mean_squared_error(labels, logits)
# batch_normalization updates moving_mean/moving_variance via UPDATE_OPS,
# so the train op must be made to depend on them
update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
with tf.control_dependencies(update_ops):
    train_op = optimizer.minimize(loss)
...

saver = tf.train.Saver(tf.global_variables())  # global variables include the BN moving statistics
with tf.Session() as sess:
    sess.run(tf.group(tf.global_variables_initializer(), 
                      tf.local_variables_initializer()))
    sess.run(train_op)  # run inside the training loop
    ...
    saver.save(sess, save_path, global_step)
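
One pitfall worth checking here (an aside of mine, not part of the original question): if the UPDATE_OPS collection is empty, for instance because it is queried before the model graph is built, the moving statistics are silently never updated. A one-line sanity check:

print(tf.get_collection(tf.GraphKeys.UPDATE_OPS))  # should list the BN update ops, not []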

When testing, I use:

logits = mymodel.inference()
saver = tf.train.Saver()

with tf.Session() as sess:
    # restoring also loads the BN moving statistics saved during training
    saver.restore(sess, checkpoint)
    sess.run(tf.local_variables_initializer())
    results = sess.run(logits)
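
As a quick sanity check (my own sketch, continuing inside the session above; the variable names assume the default tf.layers.batch_normalization naming), you can confirm the moving statistics were actually restored:

bn_stats = [v for v in tf.global_variables()
            if 'moving_mean' in v.name or 'moving_variance' in v.name]
# if the restore worked, these should differ from their init values (0 and 1)
print(sess.run(bn_stats))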

Could anyone tell me whether I am using this incorrectly? And how much does batch_size equal to 1 affect tf.layers.batch_normalization()?

Any help will be appreciated! Thanks in advance.


1 Answer


Yes, tf.layers.batch_normalization() works with batches of a single element. Batch normalization on such batches effectively amounts to instance normalization, i.e. the normalization of a single instance.
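
To see why, here is a minimal NumPy sketch (mine, not from the answer itself): on NHWC inputs, tf.layers.batch_normalization computes its statistics over the (N, H, W) axes per channel, so with N = 1 they coincide with instance normalization's per-image (H, W) statistics.

import numpy as np

x = np.random.randn(1, 8, 8, 3)  # one image: (N=1, H, W, C)

# batch norm statistics: one mean/variance per channel, over (N, H, W)
bn_mean = x.mean(axis=(0, 1, 2))
bn_var = x.var(axis=(0, 1, 2))

# instance norm statistics: per instance and channel, over (H, W)
in_mean = x.mean(axis=(1, 2))
in_var = x.var(axis=(1, 2))

# with N = 1 the two are identical
assert np.allclose(bn_mean, in_mean[0])
assert np.allclose(bn_var, in_var[0])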

@Maxim made a great post about instance normalization if you want to know more. You can also find more theory on the web and in the literature, e.g. in the paper Instance Normalization: The Missing Ingredient for Fast Stylization (Ulyanov et al., 2016).
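
If the batch statistics with batch_size = 1 turn out too noisy for your task, a possible alternative (a sketch under the assumption that your TF 1.x build ships tf.contrib.layers.instance_norm; check your version) is an explicit instance normalization layer, which keeps no moving averages and behaves identically at train and test time:

import tensorflow as tf

x = tf.placeholder(tf.float32, [1, 256, 256, 64])  # hypothetical feature map
y = tf.contrib.layers.instance_norm(x)  # normalizes each (H, W) map per channel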
