
I am trying to write a simple seq2seq model with attention in tf 1.1. I am not sure what the parameter "depth of query mechanism" means. I am getting an error when creating the attention mechanism:

TypeError: int() argument must be a string, a bytes-like object or a number, not 'TensorShape'

Here is my code. Am I on the right track? I could not find any detailed documentation.

import numpy as np
import tensorflow as tf
from tensorflow.contrib.rnn import LSTMCell, LSTMStateTuple, BasicLSTMCell, DropoutWrapper, MultiRNNCell, EmbeddingWrapper, static_rnn
import tensorflow.contrib.seq2seq as seq2seq
import attention_wrapper as wrapper


tf.reset_default_graph()
try:
    sess.close()
except:
    pass
sess = tf.InteractiveSession()


## Place holders

encode_input = [tf.placeholder(tf.int32, shape=(None,), name="ei_%i" % i)
                for i in range(input_seq_length)]

labels = [tf.placeholder(tf.int32, shape=(None,), name="l_%i" % i)
          for i in range(output_seq_length)]

decode_input = [tf.zeros_like(encode_input[0], dtype=np.int32, name="GO")] + labels[:-1]



############ Encoder
lstm_cell = BasicLSTMCell(embedding_dim)
encoder_cell = EmbeddingWrapper(lstm_cell, embedding_classes=input_vocab_size, embedding_size=embedding_dim)
encoder_outputs, encoder_state = static_rnn(encoder_cell, encode_input, dtype=tf.float32) 
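# NOTE: static_rnn returns encoder_outputs as a Python list of per-timestep
# tensors, each of shape [batch_size, embedding_dim], not a single
# [batch_size, time, depth] tensor.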

############ Decoder
# Attention Mechanisms. Bahdanau is additive style attention
attn_mech = tf.contrib.seq2seq.BahdanauAttention(
    num_units = input_seq_length, # depth of query mechanism
    memory = encoder_outputs, # hidden states to attend (output of RNN)
    normalize=False, # normalize energy term
    name='BahdanauAttention')

lstm_cell_decoder = BasicLSTMCell(embedding_dim)

# Attention Wrapper: adds the attention mechanism to the cell
attn_cell = wrapper.AttentionWrapper(
    cell = lstm_cell_decoder,# Instance of RNNCell
    attention_mechanism = attn_mech, # Instance of AttentionMechanism
    attention_size = embedding_dim, # Int, depth of attention (output) tensor
    attention_history=False, # whether to store history in final output
    name="attention_wrapper")


# Decoder setup
decoder = tf.contrib.seq2seq.BasicDecoder(
          cell = lstm_cell_decoder,
          helper = helper, # A Helper instance
          initial_state = encoder_state, # initial state of decoder
          output_layer = None) # instance of tf.layers.Layer, like Dense

# Perform dynamic decoding with decoder object
outputs, final_state = tf.contrib.seq2seq.dynamic_decode(decoder)
  • I think your code is not complete; maybe you forgot to add some details. My friend and I have successfully written a working seq2seq using the tensorflow 1.1 api, and although your approach is generally OK, you have some mistakes in using some methods. You can read the tensorflow source code to find out what your mistakes are, or email me and I will send you my code. – Iman Mirzadeh May 09 '17 at 23:33
  • Thank you! Just sent you an email. – E.Asgari May 10 '17 at 07:14
  • It was really helpful! Thank you bro! – E.Asgari May 15 '17 at 12:35
  • @E.Asgari, could you answer your own question with the answer from Iman? Thank you! – Maxime De Bruyn Jun 12 '17 at 14:38
  • They just prepared a detailed tutorial: https://github.com/tensorflow/nmt – E.Asgari Jul 13 '17 at 05:43
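
Following the pattern in the tensorflow/nmt tutorial linked in the last comment, a minimal sketch of the attention wiring could look like the code below. The placeholder names, toy sizes, and the switch from static_rnn to dynamic_rnn are illustrative assumptions rather than code from the post, and the keyword names (e.g. attention_layer_size, memory_sequence_length) follow the tf.contrib.seq2seq API as stabilized in TF 1.2+, so they may differ slightly from the 1.1 contrib module used above.

import tensorflow as tf
from tensorflow.python.layers import core as layers_core

# Illustrative sizes (assumptions, not from the original post).
src_vocab_size, tgt_vocab_size, num_units = 1000, 1000, 128
src_max_time, tgt_max_time = 20, 20

source_ids = tf.placeholder(tf.int32, [None, src_max_time])   # [batch, time]
target_ids = tf.placeholder(tf.int32, [None, tgt_max_time])
source_lengths = tf.placeholder(tf.int32, [None])
target_lengths = tf.placeholder(tf.int32, [None])

# Encoder. dynamic_rnn returns one batch-major tensor of shape
# [batch, time, num_units], which is what BahdanauAttention expects as
# `memory` (tf.stack(static_rnn_outputs, axis=1) would also work).
embedding_encoder = tf.get_variable("embedding_encoder", [src_vocab_size, num_units])
encoder_emb_inp = tf.nn.embedding_lookup(embedding_encoder, source_ids)
encoder_cell = tf.contrib.rnn.BasicLSTMCell(num_units)
encoder_outputs, encoder_state = tf.nn.dynamic_rnn(
    encoder_cell, encoder_emb_inp, sequence_length=source_lengths, dtype=tf.float32)

# Attention mechanism. num_units is the depth of the attention (query/key)
# projection layers, not the input sequence length.
attention_mechanism = tf.contrib.seq2seq.BahdanauAttention(
    num_units=num_units,
    memory=encoder_outputs,
    memory_sequence_length=source_lengths)

decoder_cell = tf.contrib.rnn.BasicLSTMCell(num_units)
attn_cell = tf.contrib.seq2seq.AttentionWrapper(
    decoder_cell, attention_mechanism, attention_layer_size=num_units)

# Decoder inputs and a training-time helper.
embedding_decoder = tf.get_variable("embedding_decoder", [tgt_vocab_size, num_units])
decoder_emb_inp = tf.nn.embedding_lookup(embedding_decoder, target_ids)
helper = tf.contrib.seq2seq.TrainingHelper(decoder_emb_inp, target_lengths, time_major=False)

# The decoder is built on the attention-wrapped cell, and its initial state
# is the wrapper's zero state seeded with the encoder's final state.
decoder_initial_state = attn_cell.zero_state(
    tf.shape(source_ids)[0], tf.float32).clone(cell_state=encoder_state)
projection_layer = layers_core.Dense(tgt_vocab_size, use_bias=False)
decoder = tf.contrib.seq2seq.BasicDecoder(
    attn_cell, helper, decoder_initial_state, output_layer=projection_layer)

# dynamic_decode returns (outputs, final_state, final_sequence_lengths) in TF >= 1.2.
outputs, final_state, _ = tf.contrib.seq2seq.dynamic_decode(decoder)

Compared with the code in the question, the main differences are that memory is a single [batch, time, depth] tensor rather than the Python list returned by static_rnn, the BasicDecoder wraps the attention cell instead of the bare LSTM cell, a helper is constructed explicitly, and the encoder state is passed in through clone(cell_state=...).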
