Refer to this post for the background of the problem: Does the TensorFlow embedding_attention_seq2seq method implement a bidirectional RNN Encoder by default?
I am working on the same model and want to replace the unidirectional LSTM encoder layer with a bidirectional one. I realize I have to use static_bidirectional_rnn instead of static_rnn, but I am getting an error due to a tensor shape mismatch.
I replaced the following line:
encoder_outputs, encoder_state = core_rnn.static_rnn(encoder_cell, encoder_inputs, dtype=dtype)
with the line below:
encoder_outputs, encoder_state_fw, encoder_state_bw = core_rnn.static_bidirectional_rnn(encoder_cell, encoder_cell, encoder_inputs, dtype=dtype)
That gives me the following error:
InvalidArgumentError (see above for traceback): Incompatible shapes: [32,5,1,256] vs. [16,1,1,256] [[Node: gradients/model_with_buckets/embedding_attention_seq2seq/embedding_attention_decoder/attention_decoder/Attention_0/add_grad/BroadcastGradientArgs = BroadcastGradientArgs[T=DT_INT32, _device="/job:localhost/replica:0/task:0/cpu:0"](gradients/model_with_buckets/embedding_attention_seq2seq/embedding_attention_decoder/attention_decoder/Attention_0/add_grad/Shape, gradients/model_with_buckets/embedding_attention_seq2seq/embedding_attention_decoder/attention_decoder/Attention_0/add_grad/Shape_1)]]
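To see where the mismatch comes from, I put together a small standalone check (a sketch only, independent of the seq2seq code; the batch size, sequence length, and cell size are made up) comparing the per-timestep output width of the two calls:

import tensorflow as tf

size = 256
seq_len, batch = 5, 16
# one placeholder per timestep, as the static RNN APIs expect a list of [batch, size] tensors
inputs = [tf.placeholder(tf.float32, [batch, size]) for _ in range(seq_len)]

with tf.variable_scope("uni"):
    cell = tf.contrib.rnn.BasicLSTMCell(size)
    uni_outputs, uni_state = tf.contrib.rnn.static_rnn(cell, inputs, dtype=tf.float32)

with tf.variable_scope("bi"):
    cell_fw = tf.contrib.rnn.BasicLSTMCell(size)
    cell_bw = tf.contrib.rnn.BasicLSTMCell(size)
    bi_outputs, state_fw, state_bw = tf.contrib.rnn.static_bidirectional_rnn(
        cell_fw, cell_bw, inputs, dtype=tf.float32)

print(uni_outputs[0].get_shape())  # (16, 256)
print(bi_outputs[0].get_shape())   # (16, 512) -- forward and backward outputs concatenated

So each encoder output is now twice as wide, and I suspect the attention_states built inside embedding_attention_seq2seq are still reshaped with the original cell.output_size, which would explain the doubled first dimension (32 vs 16) in the error above.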
I understand that the two methods return different outputs, but I do not know how to modify the attention code to account for that. How do I pass both the forward and backward states to the attention module? Do I concatenate the two hidden states?
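For the states, this is the kind of concatenation I have in mind (a sketch only, assuming encoder_cell is an LSTM cell so the returned states are LSTMStateTuples; the variable names are mine):

# concatenate forward and backward states along the feature dimension
encoder_state = tf.contrib.rnn.LSTMStateTuple(
    c=tf.concat([encoder_state_fw.c, encoder_state_bw.c], axis=1),
    h=tf.concat([encoder_state_fw.h, encoder_state_bw.h], axis=1))

But then the state is 2*size wide, so I assume the decoder cell (or some projection) would have to change as well. Is this the right approach?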