
I am building a chatbot with a sequence-to-sequence encoder-decoder model, as in NMT. From the data given, I can see that during training the decoder outputs are fed back in as decoder inputs, along with the encoder cell states. What I cannot figure out is what I should feed into the decoder when actually deploying the chatbot in real time, since at that point the output is exactly what I have to predict. Can someone help me out with this, please?

Subham Mukherjee

1 Answer


The exact answer depends on which building blocks you take from the Neural Machine Translation (NMT) model and which ones you replace with your own. I assume the graph structure is exactly as in NMT.

If so, at inference time, you can feed just a vector of zeros to the decoder.
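
For concreteness, here is a minimal inference-graph sketch under that assumption (TF 1.x contrib API; the cell size, vocabulary size, and token ids are placeholders I made up, not values from the question). It shows why the fed decoder input doesn't matter: the helper supplies the real inputs step by step.

```python
import tensorflow as tf  # TF 1.x

batch_size, vocab_size, embedding_dim = 32, 10000, 128
sos_id, eos_id = 1, 2  # hypothetical start/end token ids

embedding = tf.get_variable("embedding", [vocab_size, embedding_dim])
cell = tf.nn.rnn_cell.GRUCell(256)
# Stand-in for the final encoder state you would normally pass in.
encoder_state = cell.zero_state(batch_size, tf.float32)

# The helper produces the actual decoder inputs at each step,
# so no ground-truth decoder inputs need to be fed at all.
helper = tf.contrib.seq2seq.GreedyEmbeddingHelper(
    embedding,
    start_tokens=tf.fill([batch_size], sos_id),
    end_token=eos_id)

decoder = tf.contrib.seq2seq.BasicDecoder(
    cell, helper, initial_state=encoder_state,
    output_layer=tf.layers.Dense(vocab_size))
outputs, _, _ = tf.contrib.seq2seq.dynamic_decode(
    decoder, maximum_iterations=50)
predicted_ids = outputs.sample_id  # [batch, time] response token ids
```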


Internal details: NMT uses an entity called Helper to determine the next input to the decoder (see the tf.contrib.seq2seq.Helper documentation).

In particular, tf.contrib.seq2seq.BasicDecoder relies solely on the helper when it performs a step: the next_inputs fed into the subsequent cell is exactly the return value of Helper.next_inputs().
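
You can see this in the relevant logic of BasicDecoder.step, lightly paraphrased here from the TF 1.x source (name scoping omitted for readability):

```python
# Paraphrased from tf.contrib.seq2seq.BasicDecoder.step (TF 1.x):
def step(self, time, inputs, state, name=None):
    cell_outputs, cell_state = self._cell(inputs, state)
    if self._output_layer is not None:
        cell_outputs = self._output_layer(cell_outputs)
    sample_ids = self._helper.sample(
        time=time, outputs=cell_outputs, state=cell_state)
    # Whatever is fed to the cell on the next step comes from the helper:
    finished, next_inputs, next_state = self._helper.next_inputs(
        time=time, outputs=cell_outputs, state=cell_state,
        sample_ids=sample_ids)
    outputs = tf.contrib.seq2seq.BasicDecoderOutput(cell_outputs, sample_ids)
    return outputs, next_state, next_inputs, finished
```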

There are different implementations of the Helper interface, e.g.:

  • tf.contrib.seq2seq.TrainingHelper reads the next input from the ground-truth inputs tensor; this is the helper used in training.
  • tf.contrib.seq2seq.GreedyEmbeddingHelper discards the fed inputs and instead feeds back the embedding of the argmax token of the previous output.
  • tf.contrib.seq2seq.SampleEmbeddingHelper does the same, but samples the token from the output distribution instead of taking the argmax.

The code is in the BaseModel._build_decoder method. Note that both GreedyEmbeddingHelper and SampleEmbeddingHelper don't care what the decoder input is, so in fact you can feed anything; the zero tensor is just the standard choice. See the training-side sketch below for the contrast.
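
For contrast, a minimal training-side sketch, reusing the placeholder names from the inference sketch above and assuming decoder_inputs is the target sequence shifted right so it starts with the start token:

```python
# Ground-truth targets shifted right to start with <sos> (teacher forcing).
decoder_inputs = tf.placeholder(tf.int32, [batch_size, None])
target_lengths = tf.placeholder(tf.int32, [batch_size])

decoder_emb = tf.nn.embedding_lookup(embedding, decoder_inputs)
# TrainingHelper *does* read the fed inputs, one step at a time.
train_helper = tf.contrib.seq2seq.TrainingHelper(
    decoder_emb, sequence_length=target_lengths)
train_decoder = tf.contrib.seq2seq.BasicDecoder(
    cell, train_helper, initial_state=encoder_state,
    output_layer=tf.layers.Dense(vocab_size))
train_outputs, _, _ = tf.contrib.seq2seq.dynamic_decode(train_decoder)
logits = train_outputs.rnn_output  # feed into a sequence loss
```

Here the helper does consume the fed inputs, which is why (as the comments below discuss) the training-time decoder inputs need a sensible start token prepended.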

Maxim
  • If we feed in a vector of zeros as the 'start token' at inference time, do we need to prepend zeros to the target words during the training process for consistency? – Eweler Mar 30 '18 at 04:55
  • At inference time, in contrast with training, the input vector is not used. In training it is used, so it must be sensible. – Maxim Mar 30 '18 at 06:22
  • So e.g., if a single example of my tokenized training inputs is [1,2,3,4], should I add the start token 0 to make it [0,1,2,3,4] for correct behaviour, given that we prepend zeros at inference? – Eweler Mar 30 '18 at 07:14