
I am facing an error similar to the one described here: CTC Loss InvalidArgumentError: sequence_length(b) <= time, but there seems to be no explanation of what the error actually means. Based on the reading I did, does it mean that the sequence length of example "0" in the minibatch violates the requirement of being at most 3? In which case, why is it an error, since (as explained in the TF docs and the question above) the length of every sequence has to be less than or equal to time, right? Could anyone kindly explain how I could debug the issue and make sense of the error? I am using an existing conv2d example and trying to merge in CTC loss using some audio files I had.
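For concreteness, here is a small standalone sketch (plain NumPy, made-up values, not TensorFlow's actual implementation) of the per-example check that I believe raises this error: `ctc_loss` reads `time` from the first axis of the logits and requires every entry of `sequence_length` to be at most that.

```python
import numpy as np

batch_size, max_time, num_classes = 4, 3, 10
# logits in the layout ctc_loss expects: [max_time, batch_size, num_classes]
logits = np.zeros((max_time, batch_size, num_classes))
seq_len = np.array([5, 2, 3, 1])  # per-example frame counts fed to ctc_loss

time = logits.shape[0]  # ctc_loss takes "time" from the first axis
for b, length in enumerate(seq_len):
    if length > time:
        # mirrors the message "sequence_length(b) <= time"
        print("InvalidArgumentError: sequence_length(%d) <= %d" % (b, time))
```

With these made-up values, example 0 has 5 frames claimed but only 3 time steps available, which is exactly the shape of the failure in my stack trace.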

The code is present here: https://github.com/takingstock/ServerSide-Algos/blob/master/ctc-conv.py and the problem occurs on line 213 (apologies for pasting the GitHub URL instead of the code here; I felt it might be cleaner this way).

The stack trace:

Caused by op u'CTCLoss', defined at:
File "conv_train.py", line 279, in <module>
loss = tf.nn.ctc_loss(Y , logits, seq_len)
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/ops/ctc_ops.py", line 156, in ctc_loss
ignore_longer_outputs_than_inputs=ignore_longer_outputs_than_inputs)
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/ops/gen_ctc_ops.py", line 224, in _ctc_loss
name=name)
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 3160, in create_op
op_def=op_def)
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1625, in __init__
self._traceback = self._graph._extract_stack()  # pylint: disable=protected-access

InvalidArgumentError (see above for traceback): sequence_length(0) <= 3
     [[Node: CTCLoss = CTCLoss[ctc_merge_repeated=true, ignore_longer_outputs_than_inputs=false, preprocess_collapse_repeated=false, _device="/job:localhost/replica:0/task:0/device:CPU:0"](transpose, _arg_Placeholder_3_0_3, _arg_Placeholder_2_0_2, _arg_Placeholder_4_0_4)]]

1 Answer


It turns out the error was caused by the way I was feeding input into the ctc_loss function. The logits should have been in the shape [max_timestep, batch_size, num_classes/labels], but I was sending them the other way round. Kindly look at the updated code at the URL below; hopefully it might be of some use to some folks.

https://github.com/takingstock/ServerSide-Algos/blob/master/ctc_conv_corrected.py

To be precise, this is the part of the code that was creating issues:

# Convolution Layer
conv1 = conv2d(x, weights['wc1'], biases['bc1'])
# Max Pooling (down-sampling)
conv1 = maxpool2d(conv1, k=2)

# Convolution Layer
conv2 = conv2d(conv1, weights['wc2'], biases['bc2'])
# Max Pooling (down-sampling)
conv2 = maxpool2d(conv2, k=2)

# Fully connected layer
# Reshape conv2 output to fit fully connected layer input
fc1 = tf.reshape(conv2, [-1, weights['wd1'].get_shape().as_list()[0]])
fc1 = tf.add(tf.matmul(fc1, weights['wd1']), biases['bd1'])
fc1 = tf.nn.relu(fc1)
# Apply Dropout
fc1 = tf.nn.dropout(fc1, dropout)

If you notice, the max pooling reduces the time dimensionality of the data that is fed into ctc_loss, so the available time steps can fall below the sequence lengths you declare. Also, in my personal experience (and quite a bit of literature I have read), pooling doesn't do much good (at least not in non-image convolutions), hence I replaced the above with:
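To make the dimensionality point concrete, here is a back-of-the-envelope sketch (assuming 'SAME' padding and stride 2, and my 399-frame inputs; the numbers are specific to my setup) of how two k=2 pooling layers shrink the time axis:

```python
import math

frames = 399  # time steps per utterance in my setup
for _ in range(2):  # two k=2 max-pool layers, stride 2, 'SAME' padding
    frames = math.ceil(frames / 2)  # each pool roughly halves the time axis
print(frames)  # 100 time steps left after pooling
```

After pooling, any example whose declared sequence length exceeds 100 would trip the `sequence_length(b) <= time` check, even though 399 frames went in.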

x = tf.reshape(X, shape=[-1, num_features, 399, 1])
# Convolution Layer (note: the input here is x; the original snippet
# mistakenly passed conv1 before it was defined)
conv1 = conv2d(x, weights['wc1'], biases['bc1'], 1)
fc1 = tf.reshape(conv1, [batch_size, 399,
                         weights['wd1'].get_shape().as_list()[0]])
fc1 = tf.layers.dense(fc1, 1024, activation=tf.nn.relu)
# Apply Dropout
fc1 = tf.nn.dropout(fc1, keep_prob)
# Output, class prediction
logits = tf.layers.dense(inputs=fc1, units=num_classes, activation=tf.nn.relu)

# ctc_loss expects logits as [max_timestep, batch_size, num_classes]
logits = tf.transpose(logits, (1, 0, 2))

loss = tf.nn.ctc_loss(Y, logits, seq_len)

This way, the input going into ctc_loss has the exact required [max_timestep, batch_size, num_classes] format. Also, the results of using just one conv layer were way superior to a BiRNN (**for my data). This post also proved to be of immense intuitive help for using convolutions with ctc_loss: How to use tf.nn.ctc_loss in cnn+ctc network
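If it helps anyone debug, here is a small hypothetical sanity check (the name `check_ctc_inputs` and the shapes are mine, not part of TensorFlow) that mirrors the shape requirement before you ever call ctc_loss:

```python
import numpy as np

def check_ctc_inputs(batch_logits, seq_len):
    """Assert that logits [max_timestep, batch, classes] can cover seq_len."""
    max_time, batch, _ = batch_logits.shape
    assert len(seq_len) == batch, "need one length per batch example"
    for b, length in enumerate(seq_len):
        assert length <= max_time, (
            "sequence_length(%d)=%d exceeds time=%d" % (b, length, max_time))

# Example with my post-fix shapes: 399 time steps, batch of 8, 29 classes.
check_ctc_inputs(np.zeros((399, 8, 29)), [399] * 8)  # passes silently
```

Running this on the pre-transpose tensor (batch first) would fail immediately, which is a much clearer signal than the runtime InvalidArgumentError.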
