
I am looking to design an LSTM model using TensorFlow in which the sentences are of different lengths. I came across a tutorial on the PTB dataset (https://github.com/tensorflow/tensorflow/blob/master/tensorflow/models/rnn/ptb/ptb_word_lm.py). How does this model handle instances of varying length? The example does not discuss padding or any other technique for handling variable-size sequences.

If I use padding, what should be the unrolling dimension?

user3480922

2 Answers


You can do this in two ways.

  1. TF has a way to specify the length of each input sequence. Look for a parameter called "sequence_length"; I have used it with tf.nn.bidirectional_rnn. TF will then unroll your cell only up to sequence_length for each example, not up to the full step size (see the sketch after this list).

  2. Pad your input with a predefined dummy input and pad the targets with a predefined dummy output. The LSTM cell will learn to predict the dummy output for the dummy input. When using it (say for a matrix calculation), chop off the dummy parts.
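
For concreteness, here is a minimal sketch of both ideas using the TF 1.x API (all sizes and names below, such as max_steps and input_dim, are illustrative, not taken from any tutorial). tf.nn.dynamic_rnn is shown, but tf.nn.bidirectional_rnn accepts the same "sequence_length" argument:

```python
import tensorflow as tf

# Illustrative sizes; pick them to fit your data.
batch_size, max_steps, input_dim = 32, 50, 100
num_units, vocab_size = 128, 10000

# Inputs are zero-padded up to max_steps; seq_len holds each true length.
inputs = tf.placeholder(tf.float32, [batch_size, max_steps, input_dim])
labels = tf.placeholder(tf.int32, [batch_size, max_steps])
seq_len = tf.placeholder(tf.int32, [batch_size])

# In some TF 1.x versions the cell lives under tf.contrib.rnn instead.
cell = tf.nn.rnn_cell.BasicLSTMCell(num_units)

# Unrolls each example only up to its sequence_length; outputs beyond that
# point are zeros and the state is simply copied through.
outputs, final_state = tf.nn.dynamic_rnn(
    cell, inputs, sequence_length=seq_len, dtype=tf.float32)

# Approach 2's "chop off the dummy parts": mask the padded steps so they
# contribute nothing to the loss.
logits = tf.layers.dense(outputs, vocab_size)
per_step_loss = tf.nn.sparse_softmax_cross_entropy_with_logits(
    labels=labels, logits=logits)
mask = tf.sequence_mask(seq_len, max_steps, dtype=tf.float32)
loss = tf.reduce_sum(per_step_loss * mask) / tf.reduce_sum(mask)
```

The unrolling dimension is then simply max_steps (the length of the longest sentence you pad up to): sequence_length keeps the padded steps from affecting the state, and the mask keeps them out of the loss.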

mujjiga

The PTB model is truncated in time -- it always back-propagates a fixed number of steps (num_steps in the configs). So there is no padding -- it just reads the data and tries to predict the next word, and always reads num_steps words at a time.
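
To make that concrete, here is a rough sketch of the fixed-window reading described above (loosely modeled on the tutorial's reader; the function name and arguments here are mine, not the tutorial's):

```python
import numpy as np

def ptb_batches(token_ids, batch_size, num_steps):
    # The whole corpus is one long token stream, reshaped into batch_size
    # parallel rows; each batch is simply the next num_steps columns, so
    # every batch has a fixed shape and no padding is ever needed.
    data = np.asarray(token_ids)
    batch_len = len(data) // batch_size
    data = data[:batch_size * batch_len].reshape(batch_size, batch_len)
    epoch_size = (batch_len - 1) // num_steps
    for i in range(epoch_size):
        x = data[:, i * num_steps:(i + 1) * num_steps]          # inputs
        y = data[:, i * num_steps + 1:(i + 1) * num_steps + 1]  # next-word targets
        yield x, y
```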

Lukasz Kaiser
  • Yes, I too get this from the code flow. However, I need a sequence-prediction model that takes input of varying length. If I pad shorter sentences with 0s, how can I backpropagate when there are 0s in the input or in the expected labels of the output sequence? – user3480922 Jul 19 '16 at 09:04