
Hi all. I'm trying to get the CTC loss function working here, and it's not going very well. I keep getting this warning:

2020-11-04 07:28:53.647946: W ./tensorflow/core/util/ctc/ctc_loss_calculator.h:499] No valid path found.
2020-11-04 07:28:53.647977: W ./tensorflow/core/util/ctc/ctc_loss_calculator.h:499] No valid path found.
...(the same warning repeats for every element in the batch)

I have scoured the internet for information on this warning and haven't found anything useful.

Here is the code for it:

    def loss_fn(self, y_true, y_pred):

        batch_len = tf.keras.backend.cast(tf.shape(y_true)[0], dtype="int64")
        input_length = tf.keras.backend.cast(tf.shape(y_pred)[1], dtype="int64") #Comes out to be 30
        label_length = tf.keras.backend.cast(tf.shape(y_true)[1], dtype="int64") #Comes out to be 25

        input_length = 30 * tf.ones(shape=(batch_len, 1), dtype="int64") #Just hardcoded 30 for now
        label_length = 25 * tf.ones(shape=(batch_len, 1), dtype="int64") #Just hardcoded 25 for now



        y_true = tf.keras.layers.Softmax()(y_true)
        y_pred = tf.keras.layers.Softmax()(y_pred)

        print("y_true shape %s" %y_true.shape) #Outputs y_true shape (32, 25)
        print(y_true) #outputs Tensor("loss_fn/softmax/Softmax:0", shape=(32, 25), dtype=float32)

        print("y_pred shape %s" %y_pred.shape) #Outputs y_pred shape (32, 30, 67)
        print(y_pred) #outputs Tensor("loss_fn/softmax_1/Softmax:0", shape=(32, 30, 67), dtype=float32)

        loss = tf.keras.backend.ctc_batch_cost(y_true, y_pred, input_length, label_length)
        return tf.reduce_mean(loss)

The loss function is being called here:

...

    def ResNet(self):
        ...

        out = tf.keras.layers.Reshape((out.shape[2], out.shape[3]))(out)
        print("out %s" %out.shape) #Comes out to be: out (None, 30, 768)

        weight_initializer = tf.keras.initializers.he_uniform()
        bias_initializer = tf.keras.initializers.constant()

        logits = tf.keras.layers.Dense(67, kernel_initializer=weight_initializer, bias_initializer=bias_initializer, name="logits")(out)
        print("logits %s" %logits.shape) #Comes out to be: logits (None, 30, 67)

        print("________________________")
        print(logits)

        model = tf.keras.Model(inputs=[input, labels], outputs=logits, name="full_model")
        model.compile(optimizer="RMSprop", loss=self.loss_fn)
        print(model.summary())

Main function that calls this:

...
...
    d = dataset.Dataset(confs)
    train_data = d.read_data(confs["trn_data_files"])
    valid_data = d.read_data(confs["val_data_files"])
    callbacks = [
        tf.keras.callbacks.ModelCheckpoint("./model_checkpoint", monitor="val_loss")
    ]

    for x,y in train_data:
        history = model.fit(
            x=x,
            y=y,
            validation_data=valid_data,
            epochs=50,
            callbacks=callbacks,
        )

The Dataset class handles the preprocessing.

As you can see, the label dimension (25) is smaller than the time dimension of the logits (30). I know that the "No valid path found" error happens when this is not the case.

Am I doing something wrong? Please help. Thank you so much in advance.

1 Answer


In CTC, you need to have more hidden states than target labels. CTC in fact learns how to efficiently interleave the target labels with special "blank" symbols so that the labels best match the hidden states. However, when you have more target labels than hidden states, there is no way to align them.
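The alignment constraint can be checked with a few lines of plain Python (a sketch, not TF code; `min_ctc_input_length` is a hypothetical helper name): CTC needs one extra time step for every pair of identical adjacent labels, since a blank must separate them.

```python
# Sketch of the CTC feasibility condition. For a label sequence of
# length L containing r pairs of identical adjacent labels, CTC needs
# at least L + r time steps: repeated labels must be separated by a
# blank, or they would collapse into one.

def min_ctc_input_length(labels):
    """Minimum number of time steps CTC needs to emit `labels`."""
    repeats = sum(1 for a, b in zip(labels, labels[1:]) if a == b)
    return len(labels) + repeats

# "hello" has one repeated adjacent pair (l, l): 5 + 1 = 6 steps.
print(min_ctc_input_length(list("hello")))  # 6

# With the shapes from the question (30 time steps, 25 labels), an
# alignment exists only if the labels have at most 5 repeated pairs.
assert min_ctc_input_length(list(range(25))) <= 30
```

If any training sample violates this condition, the loss for that sample has no valid path and TF emits exactly the warning above.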

In the CNN, you probably reduce the dimensionality of the input too much, so the hidden-state sequence comes out too short. You should either reconsider how you do padding and pooling in the CNN, or (and this is probably the worse idea) do some state-splitting projection, as is done when CTC is used for machine translation.

Jindřich
  • Hi, thank you for your quick answer! Quick follow-up question: 1. This code worked just fine in tensorflow 1.1, without keras. I am just translating it from 1.1 to 2.x. Would that play a factor? I know that the ctc loss function had some changes (like no longer doing softmax automatically) when it went to tf 2.x. Thanks! – David Bradford Nov 04 '20 at 08:23
  • It might be. Originally TF had its own CPU-only implementation of CTC. Now, CTC is part of the cuDNN library and new TF uses that. – Jindřich Nov 04 '20 at 08:39
  • Could this be due to an improper installation of cuDNN? – David Bradford Nov 04 '20 at 08:46
  • I don't know. I would rather guess that it originally silently ignored the invalid samples and now it yields the warnings. – Jindřich Nov 04 '20 at 08:57
  • This is happening for literally every single element in the dataset. I don't know if that helps, but I'm guessing it's some critical issue. – David Bradford Nov 04 '20 at 09:11