
I'm making a seq2seq model that uses a Bi-LSTM encoder and an attention mechanism in the decoder. With a single LSTM layer the model works fine. My encoder looks like this:

Encoder:

def encoding_layer(self, rnn_inputs, rnn_size, num_layers, keep_prob, 
                   source_vocab_size, 
                   encoding_embedding_size,
                   source_sequence_length,
                   emb_matrix):

    embed = tf.nn.embedding_lookup(emb_matrix, rnn_inputs)   
    stacked_cells = tf.contrib.rnn.DropoutWrapper(tf.contrib.rnn.LSTMCell(rnn_size), keep_prob)
    outputs, state = tf.nn.bidirectional_dynamic_rnn(cell_fw=stacked_cells,
                                                     cell_bw=stacked_cells,
                                                     inputs=embed,
                                                     sequence_length=source_sequence_length,
                                                     dtype=tf.float32)

    concat_outputs = tf.concat(outputs, 2)
    cell_state_fw, cell_state_bw = state
    cell_state_final = tf.concat([cell_state_fw.c, cell_state_bw.c], 1)
    hidden_state_final = tf.concat([cell_state_fw.h, cell_state_bw.h], 1)
    encoder_final_state = tf.nn.rnn_cell.LSTMStateTuple(c=cell_state_final, h=hidden_state_final)

    return concat_outputs, encoder_final_state
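
As a sanity check on those concatenations, here is the shape arithmetic mimicked with plain numpy and a namedtuple standing in for tf.nn.rnn_cell.LSTMStateTuple (the batch size of 4 is made up for illustration):

```python
import numpy as np
from collections import namedtuple

# Stand-in for tf.nn.rnn_cell.LSTMStateTuple
LSTMStateTuple = namedtuple("LSTMStateTuple", ["c", "h"])

batch_size, rnn_size = 4, 128  # batch size is illustrative

# One state tuple per direction, as returned for a single-layer cell
state_fw = LSTMStateTuple(c=np.zeros((batch_size, rnn_size)),
                          h=np.zeros((batch_size, rnn_size)))
state_bw = LSTMStateTuple(c=np.zeros((batch_size, rnn_size)),
                          h=np.zeros((batch_size, rnn_size)))

# Concatenate along axis 1, as in encoding_layer above
c_final = np.concatenate([state_fw.c, state_bw.c], axis=1)
h_final = np.concatenate([state_fw.h, state_bw.h], axis=1)

print(c_final.shape)  # (4, 256)
```

This is why the decoder below doubles rnn_size: the concatenated state is 2 * rnn_size wide.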

Decoder :

def decoding_layer_train(self, encoder_outputs, encoder_state, dec_cell, dec_embed_input,
                         target_sequence_length, max_summary_length,
                         output_layer, keep_prob, rnn_size, batch_size):

    rnn_size = 2 * rnn_size
    dec_cell = tf.contrib.rnn.DropoutWrapper(tf.contrib.rnn.LSTMCell(rnn_size), keep_prob)


    train_helper = tf.contrib.seq2seq.TrainingHelper(dec_embed_input, target_sequence_length)

    attention_mechanism = tf.contrib.seq2seq.BahdanauAttention(rnn_size, encoder_outputs,
                                                               memory_sequence_length=target_sequence_length)

    attention_cell = tf.contrib.seq2seq.AttentionWrapper(dec_cell, attention_mechanism,
                                                         attention_layer_size=rnn_size/2)

    state = attention_cell.zero_state(dtype=tf.float32, batch_size=batch_size)
    state = state.clone(cell_state=encoder_state)

    decoder = tf.contrib.seq2seq.BasicDecoder(cell=attention_cell, helper=train_helper, 
                                              initial_state=state,
                                              output_layer=output_layer) 
    outputs, _, _ = tf.contrib.seq2seq.dynamic_decode(decoder, impute_finished=True, maximum_iterations=max_summary_length)

    return outputs

With the above single-layer Bi-LSTM configuration the model works fine. But now I want to use a multilayered Bi-LSTM encoder and decoder. So, in the encoder and decoder, I change the cell to:

stacked_cells = tf.contrib.rnn.MultiRNNCell([tf.contrib.rnn.DropoutWrapper(tf.contrib.rnn.LSTMCell(rnn_size), keep_prob) for _ in range(num_layers)])

After changing the cell I get this error:

AttributeError: 'tuple' object has no attribute 'c'

Here, num_layers = 2, rnn_size = 128 and embedding_size = 50.

So, I want to know what exactly is returned as the state in the second case, and how to pass that state to the decoder.
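
For reference, here is my current understanding of the difference, mimicked with plain namedtuples and numpy (so the shapes below are illustrative, not actual TensorFlow output): with MultiRNNCell, each direction's final state seems to be a tuple with one LSTMStateTuple per layer, so cell_state_fw.c fails on the outer tuple, and a per-layer concatenation would be needed to mirror the single-layer code:

```python
import numpy as np
from collections import namedtuple

# Stand-in for tf.nn.rnn_cell.LSTMStateTuple
LSTMStateTuple = namedtuple("LSTMStateTuple", ["c", "h"])

batch_size, rnn_size, num_layers = 4, 128, 2  # batch size is illustrative

def layer_state():
    return LSTMStateTuple(c=np.zeros((batch_size, rnn_size)),
                          h=np.zeros((batch_size, rnn_size)))

# With MultiRNNCell, each direction's state is a tuple: one entry per layer
state_fw = tuple(layer_state() for _ in range(num_layers))
state_bw = tuple(layer_state() for _ in range(num_layers))

# state_fw.c now raises AttributeError: 'tuple' object has no attribute 'c'
assert not hasattr(state_fw, "c")

# Per-layer concatenation, analogous to the single-layer encoder code
encoder_final_state = tuple(
    LSTMStateTuple(c=np.concatenate([fw.c, bw.c], axis=1),
                   h=np.concatenate([fw.h, bw.h], axis=1))
    for fw, bw in zip(state_fw, state_bw)
)
print(len(encoder_final_state), encoder_final_state[0].c.shape)  # 2 (4, 256)
```

If this structure is right, the decoder side would presumably also need a MultiRNNCell so its state is a matching tuple of per-layer states, but I have not verified that.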

Full code: https://github.com/sainimohit23/Text-Summarization

  • So, you want to pass all states (from all time steps) to the decoder, not just the final state, right? – kafman Feb 07 '19 at 11:49
  • @kaufmanu no. Take a look at my encoder. Bi-LSTM returns final_state_forward and final_state_backward. Right now I'm passing only final state forward. I want to pass both states. I tried to concatenate them and doubled the decoder rnn_size. But, it's not working. I have also tried this: https://codeshare.io/2BXNqp – h s Feb 07 '19 at 13:19
  • @kaufmanu btw program is working only with one state(i.e. forward_state). Code on github repo is working using only forward state. – h s Feb 07 '19 at 13:31
  • There is some code missing to really see what is going on (please paste the relevant snippets directly in your post, i.e. don't link to the entire code base). There are several ways to achieve what you want. The snippet on codeshare seems reasonable, so what exactly is the error message and what do you mean when you say "it's not working"? One guess: your encoder most likely returns an LSTMStateTuple, which is not compatible with the BahdanauAttention layer. But to verify this you should provide a minimal runnable example that produces the error in question. – kafman Feb 07 '19 at 15:53
  • @kaufmanu updated – h s Feb 07 '19 at 17:36
  • Thanks for the update, but please provide a _minimal, runnable_ code snippet that reproduces the error. Please also read [how to ask a good question in the help center](https://stackoverflow.com/help/how-to-ask). – kafman Feb 08 '19 at 08:57
  • @kaufmanu I tried to explain it in different way – h s Feb 08 '19 at 14:33
