I am building a chatbot with a sequence-to-sequence encoder-decoder model, as in NMT. From the data given, I understand that during training they feed the decoder outputs into the decoder inputs along with the encoder cell states. What I cannot figure out is what I should feed into the decoder when I actually deploy the chatbot in real time, since at that point the output is exactly what I have to predict. Can someone help me out with this, please?

-
I am also following https://github.com/tensorflow/nmt and I have the same problem. Did you find a solution? – Jignasha Royala Jan 15 '18 at 11:53
1 Answer
The exact answer depends on which building blocks you take from the Neural Machine Translation (NMT) model and which ones you replace with your own. I assume the graph structure is exactly as in NMT.
If so, at inference time, you can feed just a vector of zeros to the decoder.
Internal details: NMT uses an entity called Helper to determine the next input in the decoder (see the tf.contrib.seq2seq.Helper documentation). In particular, tf.contrib.seq2seq.BasicDecoder relies solely on the helper when it performs a step: the next_inputs fed to the subsequent cell are exactly the return value of Helper.next_inputs().
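Roughly, each decoding step delegates both the sampled token and the next input to the helper. The snippet below is a simplified paraphrase of that step, not the actual TensorFlow source:

```python
# Simplified paraphrase of what tf.contrib.seq2seq.BasicDecoder.step() does at
# every decoding step (for illustration only, not the real implementation):
def decoder_step(cell, helper, time, inputs, state, output_layer=None):
    cell_outputs, cell_state = cell(inputs, state)            # run the RNN cell
    if output_layer is not None:
        cell_outputs = output_layer(cell_outputs)             # project to vocab logits
    sample_ids = helper.sample(                               # helper picks the output token
        time=time, outputs=cell_outputs, state=cell_state)
    finished, next_inputs, next_state = helper.next_inputs(   # helper picks the next input
        time=time, outputs=cell_outputs, state=cell_state, sample_ids=sample_ids)
    return cell_outputs, sample_ids, next_inputs, next_state, finished
```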
There are different implementations of the Helper interface, e.g.:

- tf.contrib.seq2seq.TrainingHelper returns the next decoder input (usually the ground truth). This helper is used in training, as indicated in the tutorial.
- tf.contrib.seq2seq.GreedyEmbeddingHelper discards the inputs and returns the argmax-sampled token from the previous output. NMT uses this helper in inference when the sampling_temperature hyper-parameter is 0.
- tf.contrib.seq2seq.SampleEmbeddingHelper does the same, but samples the token according to a categorical (a.k.a. generalized Bernoulli) distribution. NMT uses this helper in inference when sampling_temperature > 0.
- ...
The code is in the BaseModel._build_decoder method.
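As a rough sketch of that switch (assuming, not quoting, the NMT code; names like decoder_emb_inp, embedding_decoder, sos_id, and eos_id are placeholders), the helper selection looks something like this:

```python
import tensorflow as tf

if mode == "train":
    # Feed the ground-truth target tokens at each step (teacher forcing).
    helper = tf.contrib.seq2seq.TrainingHelper(
        inputs=decoder_emb_inp,            # embedded target inputs, [batch, time, emb]
        sequence_length=decoder_lengths)
elif sampling_temperature == 0.0:
    # Greedy decoding: next input is the embedding of the argmax token.
    helper = tf.contrib.seq2seq.GreedyEmbeddingHelper(
        embedding=embedding_decoder,
        start_tokens=tf.fill([batch_size], sos_id),
        end_token=eos_id)
else:
    # Sampled decoding: next token is drawn from the output distribution.
    helper = tf.contrib.seq2seq.SampleEmbeddingHelper(
        embedding=embedding_decoder,
        start_tokens=tf.fill([batch_size], sos_id),
        end_token=eos_id,
        softmax_temperature=sampling_temperature)

decoder = tf.contrib.seq2seq.BasicDecoder(
    cell=decoder_cell, helper=helper, initial_state=encoder_final_state)
outputs, final_state, _ = tf.contrib.seq2seq.dynamic_decode(
    decoder, maximum_iterations=max_decode_length)
```

At inference the two embedding helpers build the first input from start_tokens and every later input from the previously sampled token, so the training-time decoder inputs are never read.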
Note that both GreedyEmbeddingHelper and SampleEmbeddingHelper don't care what the decoder input is. So in fact you can feed anything, but the zero tensor is the standard choice.
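For example, if your graph keeps a single decoder-input placeholder for both training and inference, an inference call could look like the following sketch (the placeholder and tensor names here are hypothetical, not from the NMT repo):

```python
import numpy as np

# Hypothetical inference call: the inference helpers above never read the
# decoder-input placeholder, so a zero tensor of the right shape is fine.
batch_size, max_target_len = 1, 50
feed = {
    encoder_inputs_ph: encoded_user_message,                   # the actual query tokens
    decoder_inputs_ph: np.zeros((batch_size, max_target_len),  # ignored at inference
                                dtype=np.int32),
}
reply_token_ids = sess.run(inference_sample_ids, feed_dict=feed)
```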

-
If we feed in a vector of zeros as the 'start token' at inference time, do we need to prepend zeros to the target words during the training process for consistency? – Eweler Mar 30 '18 at 04:55
-
In inference, in contrast with training, the input vector is not used. In training it is used, so it must be sensible – Maxim Mar 30 '18 at 06:22
-
So e.g., if a single example of my tokenized training inputs is [1,2,3,4], should I add the start token 0 to make it [0,1,2,3,4] for correct behaviour, given that we prepend zeros at inference? – Eweler Mar 30 '18 at 07:14