
I am training a bidirectional LSTM network, but when I train it I get output like this:

"
Iter 3456, Minibatch Loss= 10.305597, Training Accuracy= 0.25000
Iter 3840, Minibatch Loss= 22.018646, Training Accuracy= 0.00000
Iter 4224, Minibatch Loss= 34.245750, Training Accuracy= 0.00000
Iter 4608, Minibatch Loss= 13.833059, Training Accuracy= 0.12500
Iter 4992, Minibatch Loss= 19.687658, Training Accuracy= 0.00000
"

Even at iteration 500,000 the loss and accuracy are nearly the same. My settings are below:

# Parameters
learning_rate = 0.0001
training_iters = 20000  # 120000
batch_size = tf.placeholder(dtype=tf.int32)  # 24, 128
display_step = 16  # 10
test_num = 275  # 245
keep_prob = tf.placeholder("float")  # probability for dropout
kp = 1.0

# Network Parameters
n_input = 240*160  # 28 for MNIST data input (img shape: 28*28)
n_steps = 16  # 28 timesteps for MNIST
n_hidden = 500  # hidden layer num of features
n_classes = 20

Is this a problem with my settings, or with the technique itself?


1 Answer


The first thing I would try is varying the learning rate to see whether you can get the loss to decrease. It may also be helpful to compare the accuracy against some simple baselines (e.g., are you doing better than always predicting the most frequent class?), as sketched below.
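As a quick sanity check, such a majority-class baseline can be computed in a few lines. This is a minimal sketch; train_labels is a hypothetical NumPy array standing in for your own training labels, and with 20 roughly balanced classes the baseline would be around 0.05:

import numpy as np

# Hypothetical integer class labels; replace with your own training labels.
train_labels = np.random.randint(0, 20, size=1000)

# Baseline: always predict the most frequent class in the training set.
values, counts = np.unique(train_labels, return_counts=True)
majority_class = values[np.argmax(counts)]
baseline_accuracy = np.mean(train_labels == majority_class)
print("Majority-class baseline accuracy: %.5f" % baseline_accuracy)

If your model cannot beat this number, the network is effectively not learning anything from the inputs.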

If your loss is not decreasing at all across a wide range of learning rates, I would start looking for bugs in the code: is the training op that updates the weights actually being run, do the features and labels stay matched up, is your data randomized properly, and so on. A sketch of two of these checks follows.
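The sketch below uses hypothetical names (train_x, train_y, sess, train_op, cost, x, y, batch_x, batch_y stand in for your own arrays and graph tensors), so treat it as a pattern rather than drop-in code:

import numpy as np

# Hypothetical data arrays standing in for your own features and labels
# (small shapes here just so the example runs quickly).
train_x = np.random.rand(100, 16, 64).astype(np.float32)
train_y = np.eye(20)[np.random.randint(0, 20, size=100)].astype(np.float32)

# Check 1: shuffle features and labels with the SAME permutation,
# otherwise the inputs and targets no longer correspond to each other.
perm = np.random.permutation(len(train_x))
train_x, train_y = train_x[perm], train_y[perm]

# Check 2: the op that applies the gradients must be fetched in sess.run, e.g.
#   _, loss_val = sess.run([train_op, cost],
#                          feed_dict={x: batch_x, y: batch_y, keep_prob: kp})
# Fetching only `cost` computes the loss but never updates any weights.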

Whether there is a problem with the technique itself (a bidirectional LSTM) depends on the task you are trying to accomplish. If you are actually applying this to MNIST (based on the comment in your code), then I would rather recommend a few convolutional and max-pooling layers than an RNN.
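For reference, here is a rough sketch of such a network in the TF 1.x tf.layers API; the filter counts and layer sizes are only illustrative, not a tuned architecture:

import tensorflow as tf

# Sketch of a small conv/max-pool network for 28x28 MNIST images.
images = tf.placeholder(tf.float32, [None, 28, 28, 1])
labels = tf.placeholder(tf.int64, [None])

conv1 = tf.layers.conv2d(images, filters=32, kernel_size=5,
                         padding="same", activation=tf.nn.relu)
pool1 = tf.layers.max_pooling2d(conv1, pool_size=2, strides=2)   # 14x14
conv2 = tf.layers.conv2d(pool1, filters=64, kernel_size=5,
                         padding="same", activation=tf.nn.relu)
pool2 = tf.layers.max_pooling2d(conv2, pool_size=2, strides=2)   # 7x7

flat = tf.reshape(pool2, [-1, 7 * 7 * 64])
dense = tf.layers.dense(flat, 1024, activation=tf.nn.relu)
logits = tf.layers.dense(dense, 10)

loss = tf.reduce_mean(
    tf.nn.sparse_softmax_cross_entropy_with_logits(labels=labels,
                                                    logits=logits))
train_op = tf.train.AdamOptimizer(1e-4).minimize(loss)

A model along these lines typically moves well above the majority-class baseline within the first few hundred minibatches on MNIST, which also makes it a useful cross-check that your data pipeline is correct.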
