
I'm a beginner in TensorFlow. I want to study TensorFlow by using this tutorial.

After reading the tutorial, I wanted to run this code on my own data (Korean titles, for tokenizing). In the training model (using TrainingHelper), the prediction results seem to be OK. But in the inference model (using GreedyEmbeddingHelper), the prediction results are really bad, even on the training data. They look like the training model's predictions after the first epoch. Is there any difference between TrainingHelper and GreedyEmbeddingHelper?

I think the only difference between the tutorial and my code is the hyper-parameters.

The Hungry Dictator

2 Answers


TrainingHelper is for use at training time, when (one of the) inputs to your decoder RNN is the ground truth from the previous time step. Because the ground truth is not available at inference time, you instead feed in the decoder's own output from the previous time step.

For example, consider the target sentence "I like pizza". At training time, when decoding the word "pizza", the decoding RNN will receive the following inputs:

  1. The ground truth from the previous time step, e.g. the embedding for the word "like" (using the target embedding).
  2. The context from the previous time step.
  3. The hidden state from the previous time step.

At inference time, the decoding RNN will still receive 2 and 3. However, instead of the ground truth, it will take the decoder output from the previous time step (a one-hot vector over the target vocabulary, i.e. the word your decoder guessed at the previous time step), run it through the target embedding, and use that as an input instead.
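The two feeding schemes above can be contrasted with a toy, framework-free sketch (this is not the real TensorFlow API; `step` is a stand-in for one decoder RNN step, and the token ids and update rule are made up purely for illustration):

```python
def step(prev_token, state):
    # Dummy "RNN cell": returns (new_state, predicted_token_id).
    state = (state + prev_token) % 7
    return state, state

def decode_training(target_tokens, start_token=0):
    """TrainingHelper idea: feed the ground-truth previous token each step."""
    state, outputs, prev = 0, [], start_token
    for truth in target_tokens:
        state, pred = step(prev, state)
        outputs.append(pred)
        prev = truth  # ground truth, regardless of what we predicted
    return outputs

def decode_greedy(num_steps, start_token=0):
    """GreedyEmbeddingHelper idea: feed our own previous prediction each step."""
    state, outputs, prev = 0, [], start_token
    for _ in range(num_steps):
        state, pred = step(prev, state)
        outputs.append(pred)
        prev = pred  # the decoder's own guess from the previous step
    return outputs
```

The point is that in `decode_greedy`, one early mistake is fed back in and can derail every later step, which is why a model that looks fine under teacher forcing can produce garbage at inference time.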

Brian Barnes
  • Thanks for your kind explanation and examples. As I understand your explanation, I think I need more training data or training time to predict the next correct word. Thanks a lot ^^ – Tae-suk Kim Jun 26 '17 at 08:11
  • Sorry to necro this. Is it possible to use GreedyDecoder instead of TrainingHelper for a specific case? I realize training will be slowed... is it viable? See: https://stackoverflow.com/questions/48256372/neural-machine-translation-model-predictions-are-off-by-one – Evan Weissburg Jan 15 '18 at 18:38

Minute 28 in this TensorFlow summit talk provides some color on the helper classes. As mentioned in Brian's answer, GreedyEmbeddingHelper is meant for prediction time, when the ground truth is not available as input. But you can also take a look at ScheduledEmbeddingTrainingHelper if you want a more nuanced helper at training time.
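The idea behind ScheduledEmbeddingTrainingHelper (scheduled sampling) is that at each training step you feed the model's own prediction with some probability, and the ground truth otherwise. A minimal, illustrative sketch of that per-step choice (not the real TF API; the function name and arguments are hypothetical):

```python
import random

def scheduled_step_input(truth_token, predicted_token, sampling_prob, rng=random):
    """With probability `sampling_prob`, feed the model's own prediction
    from the previous step; otherwise feed the ground truth.

    Illustrative only: in the real ScheduledEmbeddingTrainingHelper this
    choice is made per time step inside the decoder, and `sampling_prob`
    is typically annealed upward over the course of training.
    """
    if rng.random() < sampling_prob:
        return predicted_token  # model's own guess (as at inference time)
    return truth_token          # ground truth (as with plain TrainingHelper)
```

Gradually raising `sampling_prob` during training lets the model learn to recover from its own mistakes, narrowing the gap between training-time and inference-time behavior.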

Arsene Lupin