
I have a model made up of a CNN, an RNN, and an output layer. My data consists of images and their transcriptions, with each transcription padded to a length of 9 characters. For the CTC loss I followed the Keras OCR example code, so it looks like this:

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

class CTCLayer(layers.Layer):
    def __init__(self, name=None):
        super().__init__(name=name)
        self.loss_fn = keras.backend.ctc_batch_cost

    def call(self, y_true, y_pred):
        # Every sample uses the full time axis of y_pred as its input
        # length and the full (padded) label axis as its label length.
        batch_len = tf.cast(tf.shape(y_true)[0], dtype="int64")
        input_length = tf.cast(tf.shape(y_pred)[1], dtype="int64")
        label_length = tf.cast(tf.shape(y_true)[1], dtype="int64")

        input_length = input_length * tf.ones(shape=(batch_len, 1), dtype="int64")
        label_length = label_length * tf.ones(shape=(batch_len, 1), dtype="int64")

        loss = self.loss_fn(y_true, y_pred, input_length, label_length)
        self.add_loss(loss)

        # The layer only attaches the loss; predictions pass through unchanged.
        return y_pred

Now here is how I implemented it:

# l is the number of possible classes / characters
labels = layers.Input(shape=(9,), dtype="float32")
outputs = layers.Dense(l + 1, activation='softmax', name='output')(lstm)

output = CTCLayer()(labels, outputs)

model = Model(inputs=[input_layer, labels], outputs=output)
# compile() returns None, so don't reassign the result to model
model.compile(optimizer=optimizers.Adam(0.01))
model.fit([x_train, y_train], y_train, validation_split=0.2, epochs=100)
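
For what it's worth, the CTC layer above derives input_length from tf.shape(y_pred)[1], i.e. the time axis the CNN/RNN stack produces, not from the labels. A minimal way to check that value, building on the model defined above (the layer name 'output' comes from my code; nothing else is assumed):

# The softmax output has shape (None, T, l + 1); T is what the
# CTC layer uses as input_length for every sample in the batch.
T = model.get_layer('output').output_shape[1]
print(T)  # presumably needs to be larger than the padded label length of 9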

When running model.fit, something weird started happening: I got an inf training loss but a validation loss of around 20. I looked into what might be causing it and came across this post. The accepted answer stated the following:

It's definitely the sequence length of the input that causes the problem. Apparently, the sequence length should be a bit greater than the ground truth length.

What does this mean, and how would I need to change my code to solve this issue?
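
For context, my current (possibly wrong) reading of the quoted answer: CTC needs at least one output time step per label character, plus one extra step for the blank between every pair of adjacent repeated characters, so the output time axis T would have to satisfy T >= label length + number of adjacent repeats. A small sketch of that counting rule (the helper name and the -1 padding value are my own assumptions, not from my actual code):

def min_ctc_input_length(label, pad_value=-1):
    # One time step per character, plus one blank step between
    # every pair of adjacent repeated characters.
    chars = [c for c in label if c != pad_value]
    repeats = sum(1 for a, b in zip(chars, chars[1:]) if a == b)
    return len(chars) + repeats

print(min_ctc_input_length([3, 7, 7, 2, -1, -1, -1, -1, -1]))  # 5: 4 chars + 1 repeat

If that reading is right, the time axis of my model's output would need to be comfortably larger than 9, but I don't see how to guarantee that with my current architecture.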
