3

When training my neural network (bidirectional LSTMs) for audio recognition, I am using Connectionist Temporal Classification (CTC). At some point during training, I get the following warning from TensorFlow on nearly every batch:

W tensorflow/core/util/ctc/ctc_loss_calculator.cc:144] No valid path found.

This makes the loss infinite and breaks training:

Epoch 1/5000, train_cost = inf, train_ler = 3.891, valid_cost = inf, valid_ler = 5.433
Train decoding: 
Original: you want to go over and see his gang throw dirt
Decoded: hlvoyvrofuvulovowovwvoxvovoxwyzlwkngitewewewktwlwctnbkmpajxozovofovfvfwnsfvfnfavtieitstyvubeabmbmbjljaceutztqectpmgogovgvovjuvsvsihskikqlvnsmsmsmhmvwiecececeitmhmhfvrf tiet e gekesketksmvamnmamgmnm det ietutswsezvzovovjiecgs gs smsjs s g la ah kjrkmasanxrsdrhdrxgdhdaphxda th sxhsxrdsrsvr krs farsr rdrdakr lrsrsvrsrsrdrsrsraisdsrhrhdrajfrdhxrd d

What exactly does this mean, and how can I resolve this issue?

Florian Braun
  • 1
    Did you try to clip your gradient? (https://groups.google.com/a/tensorflow.org/forum/#!msg/discuss/r1uSwRo82A0/4tfxkGKZCQAJ) – rAyyy May 30 '17 at 04:44
  • I tried a very low learning rate (0.005), which reduced the likelihood of this appearing :/, but first, learning is very slow, and second, after some time this error may suddenly appear, making further training useless. – Florian Braun Jun 07 '17 at 14:15

1 Answer

-1

I have been working on the same issue and found this article. It suggests that the probable causes are the following:

  1. The input (data) sequence is shorter than the label sequence.
  2. The input (data) sequence is too long.
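The first cause can be checked before training. The sketch below is not from the article; it assumes the standard CTC rule that each label needs at least one time step and that two identical adjacent labels need a blank between them. The function names and the sample data are hypothetical, and the frame count must be the length actually passed to the CTC loss (after any downsampling in the network).

```python
from typing import List, Sequence, Tuple


def min_ctc_input_length(labels: Sequence[int]) -> int:
    """Smallest number of time steps for which CTC has a valid alignment.

    Each label needs one frame, and every pair of identical adjacent
    labels needs one extra blank frame between them.
    """
    repeats = sum(1 for a, b in zip(labels, labels[1:]) if a == b)
    return len(labels) + repeats


def find_infeasible(samples: Sequence[Tuple[int, Sequence[int]]]) -> List[int]:
    """Return indices of samples whose input is too short for its labels."""
    return [
        i
        for i, (n_frames, labels) in enumerate(samples)
        if n_frames < min_ctc_input_length(labels)
    ]


# Hypothetical data: (frames fed to the CTC loss, label ids)
samples = [
    (10, [3, 4, 4, 5]),  # needs 5 frames: feasible
    (4, [3, 4, 4, 5]),   # needs 5 frames: too short
    (2, [1, 2, 3]),      # needs 3 frames: too short
]
print(find_infeasible(samples))
```

Samples flagged this way can be dropped or re-examined; a single such sample in a batch is enough to trigger the No valid path found. warning for that batch.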

I have been trying to train a CNN+RNN model for OCR using the CTC loss function and faced the same issue. I tried reducing the learning rate, but the issue persisted.

My dataset had a little over 70,000 data instances when I first attempted to train the model, and training worked once I reduced the number of instances used.

I'm not sure this is the best option, but it's understandable that the loss function would explode with sequences that are too long. This might be the reason if your data sequence is longer than the label sequence and it still throws the No valid path found. warning.