I ran into some odd TensorFlow behaviour. After adding tf.print calls everywhere, I traced it to the code below, but I don't know why it happens. The only explanations I can think of are a threading race condition or graph construction omitting the code segment, and I don't see why either of them should occur.

import sys

import tensorflow as tf

# A ragged tensor may have empty rows. For tensor arithmetic operations,
# we need to replace the empty rows with zero-padded tensors.
# This implementation only keeps the first entry of each row,
# so the output is a regular (dense) tensor.
def pad_empty_ragged_tensor(ragtensor):
    tf.print("Ragged tensor padding empty tensor...", output_stream=sys.stdout)
    batch_size = ragtensor.shape[0]
    n_rows = ragtensor.row_lengths()
    tf.print("row_lengths(): ", n_rows, output_stream=sys.stdout)
    new_tensor = []
    for i in range(batch_size):
        tf.print("n_rows[i]: ", n_rows[i], output_stream=sys.stdout)
        if tf.equal(n_rows[i], 0): # Tried n_rows[i] == 0 too
            tf.print("Create zero padded tensor...", output_stream=sys.stdout)
            num_zeros = ragtensor.shape[-1]
            tensor = tf.tile([[0]], [1, num_zeros])
            tensor = tf.cast(tensor, dtype=ragtensor.dtype)
        else:
            tf.print("Take first entry from the row", output_stream=sys.stdout)
            tensor = ragtensor[i,0:1]
        new_tensor.append(tensor)
    tensor = tf.stack(new_tensor, axis=0) # [batch, 1, [y, x, h, w]]
    tensor.set_shape([batch_size, 1, ragtensor.shape[-1]])
    tf.print("The padded tensor shape: ", tensor.shape, output_stream=sys.stdout)
    return tensor

Here is a segment of the print trace:

row_lengths():  [1 1 0 ... 1 1 1]
n_rows[i]:  1
Take first entry from the row
n_rows[i]:  1
Take first entry from the row
n_rows[i]:  0
Take first entry from the row
n_rows[i]:  1
Take first entry from the row

As shown, the if tf.equal(n_rows[i], 0): block (I also tried n_rows[i] == 0) is never entered. Execution takes the 'else' branch every time, even when the equality condition is met. Could anyone hint at what went wrong? Also, debugging the TensorFlow runtime is difficult: breakpoints in VS Code are not hit once graph execution runs, and tfdbg does not work with eager execution either. Any suggestion on this would be very helpful too.
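
One way to sidestep the per-row Python branch altogether is to pad in a vectorized way. The following is only a minimal sketch, assuming TensorFlow 2.x (or 1.x with eager execution enabled) and a ragged tensor of shape [batch, (rows), width]. The function name first_entry_or_zeros and the sample data are hypothetical and not part of the original code.

```python
import tensorflow as tf


def first_entry_or_zeros(ragtensor):
    """Return the first entry of each row, or zeros for empty rows.

    Hypothetical helper: assumes ragtensor has shape [batch, (rows), width].
    """
    # to_tensor() fills missing entries with zeros: [batch, max_rows, width].
    dense = ragtensor.to_tensor()
    # Append one zero row so the slice below always finds a row,
    # even when every row in the batch is empty (max_rows == 0).
    dense = tf.pad(dense, [[0, 0], [0, 1], [0, 0]])
    # Keep only the first entry of each row: [batch, 1, width].
    return dense[:, :1]


# Made-up sample: three rows with 1, 0 and 2 entries of width 4.
values = tf.constant([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])
ragged = tf.RaggedTensor.from_row_lengths(values, row_lengths=[1, 0, 2])
padded = first_entry_or_zeros(ragged)
tf.print("padded shape:", tf.shape(padded))
```

Because no Python-level if is evaluated on a tensor value, this version behaves the same in eager and graph execution.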

My dev env:

  • OS: Ubuntu 18.04
  • Python: 3.6
  • Tensorflow-gpu: 1.14
  • GPU: RTX 2070
  • CUDA: 10.1
  • cuDNN: 7.6
  • IDE: VS Code
  • Tensorflow mode: Eager execution

Thanks in advance

Kriss
  • [`tf.equal()`](https://www.tensorflow.org/api_docs/python/tf/math/equal) is an element-wise operator. So you'd have to make both the inputs to the function of the same length and finally use `tf.reduce_all(tf.equal(a, b))` . If however, you do not want to do that, you can check for individual entries in the returned tensor. – learner Oct 22 '19 at 05:20
  • Thank you for your suggestion. I eventually worked out the problem. It was due to memory exhaustion. Once reducing batch size to 4, it works fine now. – Kriss Oct 23 '19 at 21:27
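
The element-wise behaviour described in the first comment can be illustrated with a short sketch. It assumes eager execution (the default in TensorFlow 2.x), and the tensor values are made up to mirror the row lengths in the trace.

```python
import tensorflow as tf

# Made-up values mirroring the row lengths from the trace.
n_rows = tf.constant([1, 1, 0, 1])
expected = tf.constant([1, 1, 0, 1])

# tf.equal compares element by element and returns a boolean tensor
# with the same shape as its (broadcast) inputs.
elementwise = tf.equal(n_rows, expected)

# tf.reduce_all collapses that tensor to a single scalar boolean.
all_equal = tf.reduce_all(elementwise)

# Comparing one indexed element against a scalar already yields a scalar,
# which is the form used in the question's loop.
third_is_zero = tf.equal(n_rows[2], 0)

tf.print("elementwise:", elementwise)
tf.print("all_equal:", all_equal)
tf.print("third_is_zero:", third_is_zero)
```

In eager mode a scalar boolean tensor such as third_is_zero can be used directly in a Python if statement, so tf.reduce_all is only needed when the compared tensors have more than one element.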

0 Answers