
I am implementing a handwriting recognition model using CTC with LSTMs. I saw a discussion on GitHub saying that the input length must be at least 2n-1, where n is the output (label) length. I tested this, and it held: whenever the input length is less than 2n-1, the loss goes to infinity, and whenever it is at least 2n-1, it doesn't.

I partially understand why this might be: there needs to be room to insert blanks. However, in my case there should never be a need for blanks, because there are never two consecutive identical symbols. The output is always [symbol, spatial relationship between symbols, symbol, spatial relationship between symbols, ...], so even if someone were to write "11", the two "1"s would be separated by a relationship label without the need for a blank. My understanding was that a blank is only required between repeated symbols, not between every pair of symbols, but it seems like I'm wrong?
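For concreteness, here is a small sketch of the label structure I mean (the symbol and relation names are hypothetical, just for illustration):

```python
# Hypothetical label alphabet: symbols and spatial relations interleave,
# so even a repeated digit like "11" never produces two identical
# adjacent labels in the target sequence.
SYM_1, SYM_2, REL_RIGHT, REL_ABOVE = 0, 1, 2, 3

# Target for writing "11": symbol, relation, symbol -- no adjacent repeats.
labels = [SYM_1, REL_RIGHT, SYM_1]
assert all(a != b for a, b in zip(labels, labels[1:]))
```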

I am using keras.backend.ctc_batch_cost, if that makes any difference.
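A minimal sketch of the kind of test I ran (the shapes, class count, and label values here are placeholders; TensorFlow's convention of blank = num_classes - 1 is assumed):

```python
import numpy as np
import tensorflow as tf

n = 3            # label (output) length, no repeated adjacent symbols
num_classes = 5  # 4 real symbols + 1 blank (TF uses the last index as blank)

def ctc_loss_for_input_length(T):
    """CTC loss for a random length-T softmax output against a fixed label."""
    y_true = np.array([[0, 1, 2]], dtype=np.int32)  # (1, n), no repeats
    y_pred = tf.nn.softmax(np.random.randn(1, T, num_classes).astype(np.float32))
    input_length = np.array([[T]], dtype=np.int32)
    label_length = np.array([[n]], dtype=np.int32)
    return tf.keras.backend.ctc_batch_cost(
        y_true, y_pred, input_length, label_length).numpy()

# In theory only T >= n should be needed when there are no repeats,
# but I observe inf for anything below 2*n - 1.
for T in (n, 2 * n - 2, 2 * n - 1):
    print(T, ctc_loss_for_input_length(T))
```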

Aiden Yun