
There are several classes in tf.nn that relate to RNNs. In the examples I find on the web, tf.nn.dynamic_rnn and tf.nn.rnn seem to be used interchangeably or at least I cannot seem to figure out why one is used in place of the other. What is the difference?

Mad Wombat
    See also this SO post https://stackoverflow.com/q/42497216/3924118, where the author asks about the equivalent function of `tf.nn.rnn` for more recent versions of TensorFlow, which seems to be `tf.nn.static_rnn`. – nbro Jan 06 '18 at 04:52

2 Answers


From RNNs in Tensorflow, a Practical Guide and Undocumented Features by Denny Britz, published on August 21, 2016:

tf.nn.rnn creates an unrolled graph for a fixed RNN length. That means, if you call tf.nn.rnn with inputs having 200 time steps you are creating a static graph with 200 RNN steps. First, graph creation is slow. Second, you’re unable to pass in longer sequences (> 200) than you’ve originally specified.

tf.nn.dynamic_rnn solves this. It uses a tf.while_loop to dynamically construct the graph when it is executed. That means graph creation is faster and you can feed batches of variable size.
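To make the difference concrete, here is a minimal NumPy sketch (not from the original answer) of the idea behind dynamic unrolling: instead of baking a fixed number of steps into the graph, a runtime loop steps through however many time steps the batch actually has, which is roughly what tf.nn.dynamic_rnn's while-loop does internally. The RNN cell here is a plain tanh cell for illustration.

```python
import numpy as np

def simple_rnn(inputs, W_x, W_h, b):
    """Dynamically unroll a vanilla tanh RNN over `inputs` of shape
    [batch, time, depth]. The number of steps is read from the data at
    run time, so no fixed sequence length is baked in."""
    batch, time, _ = inputs.shape
    h = np.zeros((batch, W_h.shape[0]))   # initial hidden state
    for t in range(time):                 # loop length depends on the input
        h = np.tanh(inputs[:, t, :] @ W_x + h @ W_h + b)
    return h

rng = np.random.default_rng(0)
W_x = rng.normal(size=(3, 4))             # input depth 3, hidden size 4
W_h = rng.normal(size=(4, 4))
b = np.zeros(4)

# The same function handles batches with different sequence lengths,
# which a statically unrolled graph cannot do.
short = simple_rnn(rng.normal(size=(2, 5, 3)), W_x, W_h, b)    # 5 steps
long = simple_rnn(rng.normal(size=(2, 300, 3)), W_x, W_h, b)   # 300 steps
print(short.shape, long.shape)
```

A statically unrolled version would instead emit one copy of the cell's operations per time step at graph-construction time, fixing the sequence length once and for all.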

Abhishek Mishra
    Why would one still use static RNN if the dynamic RNN provides all the advantages with practically no downsides? – xji Jan 19 '18 at 22:46
  • Did you mean to say "feed sequences of different lengths"? As far as I know one can easily feed different batches in any graph, just declare proper placeholders. – user1700890 May 23 '18 at 20:45

They are nearly the same, but there is a little difference in the structure of input and output. From the documentation:

tf.nn.dynamic_rnn

This function is functionally identical to the function rnn above, but performs fully dynamic unrolling of inputs.

Unlike rnn, the input inputs is not a Python list of Tensors, one for each frame. Instead, inputs may be a single Tensor where the maximum time is either the first or second dimension (see the parameter time_major). Alternatively, it may be a (possibly nested) tuple of Tensors, each of them having matching batch and time dimensions. The corresponding output is either a single Tensor having the same number of time steps and batch size, or a (possibly nested) tuple of such tensors, matching the nested structure of cell.output_size.
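The two input layouts the quoted documentation describes can be sketched in NumPy (this is an illustration, not TensorFlow code): tf.nn.dynamic_rnn takes a single [batch, time, depth] tensor (or [time, batch, depth] with time_major=True), while the older tf.nn.rnn takes a Python list with one [batch, depth] tensor per time step, which is what tf.unstack along the time axis produces.

```python
import numpy as np

# A batch of 2 sequences, 5 time steps, depth 3 -- the single-tensor
# layout dynamic_rnn expects with time_major=False: [batch, time, depth].
batched = np.arange(2 * 5 * 3, dtype=float).reshape(2, 5, 3)

# The list-of-frames layout the older rnn expects: one [batch, depth]
# array per time step (the equivalent of tf.unstack(batched, axis=1)).
frames = [batched[:, t, :] for t in range(batched.shape[1])]
print(len(frames), frames[0].shape)    # 5 frames, each [batch, depth]

# The time_major=True layout: [time, batch, depth].
time_major = np.transpose(batched, (1, 0, 2))
print(time_major.shape)
```

Because the static variant needs one graph node per element of that list, the list length (and hence the sequence length) is fixed when the graph is built, whereas the single-tensor form leaves the time dimension to be resolved at run time.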

For more details, explore the source.

Dmitriy Danevskiy