Another possibility is to accumulate the summary over the test batches outside of TensorFlow and have a dummy variable in the graph to which you can then assign the result of the accumulation. As an example: say you compute the loss on the validation set over several batches and want a summary of the mean. You could achieve this in the following way:
# graph construction (assumes `import tensorflow as tf` and that this runs inside your model class)
with tf.name_scope('valid_loss'):
    v_loss = tf.Variable(tf.constant(0.0), trainable=False)
    # placeholder + assign op let you push an externally computed value into the graph
    self.v_loss_pl = tf.placeholder(tf.float32, shape=[], name='v_loss_pl')
    self.update_v_loss = tf.assign(v_loss, self.v_loss_pl, name='update_v_loss')
with tf.name_scope('valid_summaries'):
    v_loss_s = tf.summary.scalar('validation_loss', v_loss)
    self.valid_summaries = tf.summary.merge([v_loss_s], name='valid_summaries')
Then at evaluation time:
# accumulate the loss over all validation batches and average it outside of TensorFlow
total_loss = 0.0
for batch in all_batches:
    loss, _ = sess.run([get_loss, ...], feed_dict={...})
    total_loss += loss
total_loss /= float(n_batches)
# feed the averaged value back in, update the dummy variable, and fetch the summary
_, v_summary_str = sess.run([self.update_v_loss, self.valid_summaries],
                            feed_dict={self.v_loss_pl: total_loss})
writer.add_summary(v_summary_str)  # optionally pass global_step so points align with training
While this gets the job done, it admittedly feels a bit hacky. The streaming metric evaluation from contrib you posted might well be more elegant - I hadn't come across it before, so I'm curious to check it out.
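For reference, a rough sketch of what that streaming approach could look like using tf.metrics.mean (the TF 1.x successor to the contrib streaming ops). This is an assumption on my part, not tested code; it reuses get_loss, all_batches, sess and writer from the snippet above as stand-ins:

# minimal sketch, assuming TF 1.x and the same get_loss tensor as above
mean_loss, update_mean = tf.metrics.mean(get_loss, name='valid_mean_loss')
mean_loss_s = tf.summary.scalar('validation_loss_streaming', mean_loss)

# the metric keeps its running total/count in local variables,
# so reset them before each validation pass
sess.run(tf.local_variables_initializer())
for batch in all_batches:
    sess.run(update_mean, feed_dict={...})  # accumulates total and count in-graph
summary_str = sess.run(mean_loss_s)
writer.add_summary(summary_str)

The nice part is that the accumulation happens inside the graph, so the placeholder/assign plumbing disappears; the trade-off is having to remember to reset the metric's local variables between evaluation runs.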