
Training a model with tf.nn.ctc_loss produces an error every time the train op is run:

tensorflow/core/util/ctc/ctc_loss_calculator.cc:144] No valid path found.

Unlike in previous questions about this function, this is not due to divergence. I am using a low learning rate, and the error occurs even on the first train op.

The model is a CNN -> LSTM -> CTC. Here is the model creation code:

# Build Graph
self.videoInput = tf.placeholder(shape=(None, self.maxVidLen, 50, 100, 3), dtype=tf.float32)
self.videoLengths = tf.placeholder(shape=(None), dtype=tf.int32)
self.keep_prob = tf.placeholder(dtype=tf.float32)
self.targets = tf.sparse_placeholder(tf.int32)
self.targetLengths = tf.placeholder(shape=(None), dtype=tf.int32)

conv1 = tf.layers.conv3d(self.videoInput ...)
pool1 = tf.layers.max_pooling3d(conv1 ...)
conv2 = ...
pool2 = ...
conv3 = ...
pool3 = ...

cnn_out = tf.reshape(pool3, shape=(-1, self.maxVidLen, 4*7*96))

fw_cell = tf.nn.rnn_cell.MultiRNNCell([self.cell() for _ in range(3)])
bw_cell = tf.nn.rnn_cell.MultiRNNCell([self.cell() for _ in range(3)])
outputs, _ = tf.nn.bidirectional_dynamic_rnn(
            fw_cell, bw_cell, cnn_out, sequence_length=self.videoLengths, dtype=tf.float32)

outputs = tf.concat(outputs, 2)
outputs = tf.reshape(outputs, [-1, self.hidden_size * 2])

w = tf.Variable(tf.random_normal((self.hidden_size * 2, len(self.char2index) + 1), stddev=0.2))
b = tf.Variable(tf.zeros(len(self.char2index) + 1))

out = tf.matmul(outputs, w) + b
out = tf.reshape(out, [-1, self.maxVidLen, len(self.char2index) + 1])
out = tf.transpose(out, [1, 0, 2])

cost = tf.reduce_mean(tf.nn.ctc_loss(self.targets, out, self.targetLengths))
self.train_op = tf.train.AdamOptimizer(0.0001).minimize(cost)

And here is the feed dict creation code:

indices = []
values = []
shape = [len(vids) * 2, self.maxLabelLen]
vidInput = np.zeros((len(vids) * 2, self.maxVidLen, 50, 100, 3), dtype=np.float32)

# Actual video, then left-right flip
for j in range(len(vids) * 2):

    # K is video index
    k = j if j < len(vids) else j - len(vids)

    # convert video and label to input format
    vidInput[j, 0:len(vids[k])] = vids[k] if k == j else vids[k][:,::-1,:]
    indices.extend([j, i] for i in range(len(labelList[k])))
    values.extend(self.char2index[c] for c in labelList[k])

fd[self.targets] = (indices, values, shape)
fd[self.videoInput] = vidInput

# Collect video lengths and label lengths
vidLengths = [len(j) for j in vids] + [len(j) for j in vids]
labelLens = [len(l) for l in labelList] + [len(l) for l in labelList]
fd[self.videoLengths] = vidLengths
fd[self.targetLengths] = labelLens
– pwp2
4 Answers


It turns out that the ctc_loss requires that the label lengths be shorter than the input lengths. If the label lengths are too long, the loss calculator cannot unroll completely and therefore cannot compute the loss.

For example, the label BIFI would require an input length of at least 4, while the label BIIF would require an input length of at least 5, due to a blank being inserted between the repeated symbols.
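
A quick way to catch such examples before they reach ctc_loss is to compute, for each label, the minimum number of time steps it needs (its length plus one for every pair of adjacent repeated symbols) and compare that against the corresponding input length. Here is a minimal sketch in plain Python, reusing the labelList and vidLengths names from the question's feed-dict code:

def min_ctc_input_len(label):
    # One time step per symbol, plus one extra step for the blank that must
    # separate each pair of adjacent repeated symbols.
    repeats = sum(1 for a, b in zip(label, label[1:]) if a == b)
    return len(label) + repeats

for label, input_len in zip(labelList, vidLengths):
    if min_ctc_input_len(label) > input_len:
        print("CTC cannot fit label %r: needs >= %d steps, input has %d"
              % (label, min_ctc_input_len(label), input_len))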

– user3733083
  • Any advice on how to debug this? It seems those would be wrong examples, but it's difficult to identify them when training in batches – Ciprian Tomoiagă Jan 12 '18 at 09:12
  • @CiprianTomoiagă: that's pretty simple - you know the length (T) of the RNN output sequence. And you know your ground truth texts for which you calculate the length (L) and the number of repeated characters (R). Now you only have to check if L+R<=T. If not, CTC can't compute a loss value for this text and will throw the mentioned warning. Example: T=2, txt1="ab", txt2="aa". L1=2, R1=0 -> L1+R1<=T -> ok. L2=2, R2=1 -> L2+R2>T -> not ok. – Harry Sep 11 '18 at 13:44

I had the same issue, but I soon realized it was because I was using glob and my label came from the filename, so the full path returned by glob made the label exceed the input length.

You can fix this issue by using:

os.path.join(*(filename.split(os.path.sep)[noOfDir:]))
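
For context, here is a hedged sketch of the situation this answer describes: glob returns full paths, so deriving the label directly from the returned string drags the directory components into the label. The directory layout, the noOfDir value, and the label extraction below are illustrative assumptions, not part of the original answer.

import glob
import os

noOfDir = 2  # number of leading directory components to drop (assumed)

for path in glob.glob("data/train/*.png"):  # hypothetical layout
    # glob yields e.g. "data/train/HELLO.png"; using that whole string as the
    # label silently makes the label much longer than intended.
    name = os.path.join(*(path.split(os.path.sep)[noOfDir:]))  # "HELLO.png"
    label = os.path.splitext(name)[0]                          # "HELLO"
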
– Andrew Fan

For me the problem was fixed by setting preprocess_collapse_repeated=True.
FWIW: my target sequence lengths were already shorter than the input lengths, and the RNN outputs were softmax outputs.
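
For reference, a sketch of where the flag goes, applied to the cost line from the question (preprocess_collapse_repeated=True tells ctc_loss to collapse repeated labels in the targets before the loss is computed):

cost = tf.reduce_mean(
    tf.nn.ctc_loss(self.targets, out, self.targetLengths,
                   preprocess_collapse_repeated=True))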

– Zining Zhu

Another possible cause, which I found in my case, is that the input data is not normalized to the 0–1 range. Because of that, the LSTM activation function saturates at the beginning of training, which somehow produces the "No valid path found" log.
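
A hedged sketch of that fix against the question's feed-dict loop, assuming the raw frames are 8-bit RGB in the 0-255 range (the question does not say):

# Scale each frame to 0-1 before writing it into vidInput (frames assumed uint8).
frames = vids[k].astype(np.float32) / 255.0
vidInput[j, 0:len(vids[k])] = frames if k == j else frames[:, ::-1, :]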

– TingQian LI