CTC or “connectionist temporal classification” is a machine learning technique for mapping dense input data to shorter output sequences in the same order.
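As an illustration of the mapping described above, here is a minimal sketch of the collapsing rule commonly used with CTC: consecutive repeated labels are merged, then a special blank label is dropped. The function name `ctc_collapse` and the blank index 0 are assumptions made for this example and are not taken from any particular library.

```python
from itertools import groupby

BLANK = 0  # assumed blank index; libraries differ on this convention


def ctc_collapse(frame_labels, blank=BLANK):
    """Collapse a per-frame label sequence into an output sequence.

    Step 1: merge runs of identical consecutive labels.
    Step 2: remove the blank label.
    """
    merged = [label for label, _ in groupby(frame_labels)]
    return [label for label in merged if label != blank]


# Eight input frames map to a three-label output; the blank between
# the two 3s is what keeps them from being merged into one.
print(ctc_collapse([0, 3, 3, 0, 3, 5, 5, 0]))  # [3, 3, 5]
```

The output keeps the order of the input frames, which is why CTC suits tasks such as speech and handwriting recognition, where the alignment is monotonic.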
Questions tagged [ctc]
85 questions
1
vote
1 answer
Can RNN / LSTM be used for non-standard text OCR?
I have read about LSTMs and RNNs, even CTC. From what I understand, an RNN is used to figure out a missing token in a sequence (e.g. a word in a sentence). However, my problem is reading person names written in cursive script. Many names are not popular…

FindOutIslamNow
1
vote
1 answer
Understanding how the TF implementation of CTC works
I'm trying to understand how the CTC implementation works in TensorFlow. I've written a quick example just to test the CTC function, but for some reason I'm getting inf for some target/input values and I'm not sure why that is happening.
Code:
import tensorflow as…

Ehab AlBadawy
1
vote
0 answers
CTC loss for Keras
How do I use TensorFlow's CTC loss function in Keras?
I have tried doing it like this:
def ctc_loss(y_true, y_pred):
    return(tf.nn.ctc_loss(y_pred, y_true, 64,
        preprocess_collapse_repeated=False, ctc_merge_repeated=False,
…

Subham Mukherjee
0
votes
0 answers
How to train KenLM language model for Nvidia's QuartzNet?
I am trying to train a speech-to-text model for the Armenian language. I am using the Nvidia NeMo toolkit. After training the acoustic model, I used the provided NeMo/scripts/asr_language_modeling/ngram_lm/train_kenlm.py file to train the language…

arm
0
votes
0 answers
RuntimeError: real is not implemented for tensors with non-complex dtypes
I have installed the warpctc module. Then, inside "home/sultan/Desktop/Adversarial-ASR-Attack/venv/lib/python3.8/sitepackages/art/estimators/speech_recognition/pytorch_deep_speech.py", I have written the following code:
def magphase(D, *, power=1):
mag…

Christopher Marlowe
0
votes
0 answers
How to pad input sequences for Connectionist Temporal Classification (CTC)?
I am training a model with CTC, and I need to pad the input sequences for batches. However, the input length has to be at least 2*(output length)-1 because CTC has to output blanks between every output symbol. If I were to pad the output with pad…

Aiden Yun
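For readers following the length constraint mentioned in this question, here is a small sketch of how the minimum input length can be computed in the standard CTC formulation, as far as that formulation is commonly described. A blank is only mandatory between two identical adjacent labels, so the minimum is the target length plus the number of adjacent repeats. The bound 2n-1 quoted above is the worst case, where every adjacent pair is a repeat. The function name `min_ctc_input_length` is hypothetical.

```python
def min_ctc_input_length(target):
    """Smallest number of input frames that admits a valid CTC alignment.

    Each label needs one frame, and each pair of identical adjacent
    labels needs one extra frame for the separating blank.
    """
    repeats = sum(1 for a, b in zip(target, target[1:]) if a == b)
    return len(target) + repeats


print(min_ctc_input_length("hello"))  # 6: five labels plus one blank between the two l's
print(min_ctc_input_length("aaa"))    # 5: the worst case, 2n-1 with n = 3
```

When padding a batch, a check like this can be applied to each unpadded input length and its target before the loss is computed.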
0
votes
0 answers
CTC Loss Function
I have used a CNN and LSTM architecture for my model training
CTC loss
I got this error in model.fit
I am using the TIMIT dataset for speech recognition. I used spectrograms of phonemes as input. I have used a CNN+LSTM architecture and CTC loss for model…
0
votes
0 answers
Why does CTC loss = infinity when the input size is not two times greater than output size?
I am implementing a handwriting recognition model and using CTC with LSTMs. I saw a discussion on GitHub saying that the input size must be at least 2n-1 where n is the output size. I tried seeing if that was the case, and it was! Whenever the input…

Aiden Yun
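To make the infinite loss in this question concrete, here is a minimal pure-Python sketch of the CTC forward algorithm in log space. It is written for illustration under the usual textbook formulation and is not the implementation used by any specific framework. The names `ctc_neg_log_likelihood` and `logsumexp`, and the blank index 0, are assumptions. When the input has too few frames, no valid alignment exists, the total probability is zero, and the negative log-likelihood is infinite.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) that tolerates -inf entries."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))


def ctc_neg_log_likelihood(log_probs, target, blank=0):
    """Negative log-likelihood of `target` under CTC.

    log_probs: list of T rows (T >= 1), each a list of per-class log-probabilities.
    target:    list of label indices (none equal to `blank`).
    """
    # Extended target: blank, l1, blank, l2, ..., blank
    ext = [blank]
    for label in target:
        ext += [label, blank]
    S, T = len(ext), len(log_probs)

    # Initialise: a path may start on the first blank or the first label.
    alpha = [-math.inf] * S
    alpha[0] = log_probs[0][ext[0]]
    if S > 1:
        alpha[1] = log_probs[0][ext[1]]

    for t in range(1, T):
        new = [-math.inf] * S
        for s in range(S):
            candidates = [alpha[s]]                # stay on the same symbol
            if s >= 1:
                candidates.append(alpha[s - 1])    # advance by one
            # Skipping a blank is allowed only between different labels.
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                candidates.append(alpha[s - 2])
            total = logsumexp(candidates)
            if total != -math.inf:
                new[s] = total + log_probs[t][ext[s]]
        alpha = new

    # A valid path ends on the last label or the trailing blank.
    ends = [alpha[S - 1]] + ([alpha[S - 2]] if S > 1 else [])
    return -logsumexp(ends)


# Uniform distribution over 3 classes (blank plus two labels).
frames = [[math.log(1 / 3)] * 3 for _ in range(3)]
print(ctc_neg_log_likelihood(frames[:2], [1, 1]))  # inf: two frames cannot emit "1, blank, 1"
print(ctc_neg_log_likelihood(frames, [1, 1]))      # finite with three frames
```

Note that the target `[1, 2]` is reachable with only two frames, because no blank is needed between different labels.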
0
votes
1 answer
How to solve infinity/NaN loss for CTC? Is there something similar to zero_infinity in PyTorch for Keras/TensorFlow?
I am using a CTC loss for handwriting recognition in TensorFlow/Keras. However, just a few seconds after the model starts fitting, the loss goes to infinity.
I think this is because the input size isn't much bigger than the output size. PyTorch has…

Aiden Yun
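The idea behind the `zero_infinity` option of PyTorch's `torch.nn.CTCLoss` can be sketched framework-independently: per-sample losses that are infinite are replaced by zero before the batch reduction, so one unalignable sample does not swamp the whole batch. The helper below is a hypothetical pure-Python illustration of that masking step only. It does not reproduce PyTorch's exact reduction or its gradient handling, and the name `mean_loss_zero_infinity` is an assumption.

```python
import math


def mean_loss_zero_infinity(per_sample_losses):
    """Average a batch of losses after replacing infinite entries with 0.0."""
    if not per_sample_losses:
        return 0.0
    masked = [0.0 if math.isinf(loss) else loss for loss in per_sample_losses]
    return sum(masked) / len(masked)


# The second sample has no valid alignment, so its loss is infinite.
print(mean_loss_zero_infinity([2.0, math.inf, 4.0]))  # 2.0
```

Masking hides the symptom rather than the cause, so it is usually worth checking first whether the affected samples simply have inputs that are too short for their targets.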
0
votes
1 answer
TypeError: max() received an invalid combination of arguments when trying to use beam search decoding
I'm trying to run a simple example of decoding Wav2Vec2 output with beam search (without an LM):
from pyctcdecode import build_ctcdecoder
from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor
from torchaudio.utils import…

user3668129
0
votes
0 answers
Padding time dimension in softmax output for CTC loss
Network:
Input sequence -> BiLSTM         -> BiLSTM         -> Dense with softmax
Output shapes:    (None, 5, 256)    (None, 5, 128)    (None, 5, 11)
Here is my CTC loss:
def calculate_ctc_loss(y_true, y_pred):
    batch_length =…

enterML
0
votes
1 answer
Automatic Speech Recognition CTC Loss Suddenly Goes to Infinity After 100 Batches
I have been struggling to create an automatic speech recognition neural network using TensorFlow, trained on the Hugging Face Mozilla Common Voice 11 dataset. The model seems to train well for around 100 batches before the loss suddenly goes to…

Nathan Montanez
0
votes
0 answers
Calculating accuracy on padded tensors
Greetings,
I am currently trying to develop a scene text-recognition model with a CNN backbone (which might change to a pre-trained ResNet in the future) and an LSTM for the predictions.
Skipping all the unnecessary info, my targets are in shape [16,…

Dimou_
0
votes
0 answers
I want to apply the CTC loss function. I have written the code, but I am getting this error: Exception encountered when calling layer "ctc" (type Lambda)
I have written the code promptly; the error says to add a lambda, but as you can see it is already in the main code. It would be really helpful if anyone could provide a solution with a proper description of why the error…

Nilotpal Barman
0
votes
0 answers
ValueError: Exception encountered when calling layer "ctc_loss"
TensorFlow beginner here. I got this error message and I have no clue what to do, where to look, or what to change. Can someone point me in the right direction?
Code: https://github.com/niconielsen32/NeuralNetworks/blob/main/OCRcaptchas.ipynb
…

daniel guo