
I would like to apply dropout to the outputs of an RNN. For example, in TensorFlow 1.8.0, I could do this:

import tensorflow as tf
import tensorflow.contrib.eager as tfe

tfe.enable_eager_execution()

x = tf.random_uniform((10, 5, 3))

gru_cell1 = tf.contrib.rnn.GRUCell(2)
gru_cell1 = tf.contrib.rnn.DropoutWrapper(gru_cell1, output_keep_prob=0.5)
cell = tf.contrib.rnn.MultiRNNCell([gru_cell1])
init_state = cell.zero_state(10, tf.float32)

cell_output, _ = tf.nn.dynamic_rnn(cell, x,
                                   initial_state=init_state, time_major=False)
cell_output

How can I achieve the same thing using the Keras API?

I have thought of the following two ways but they were unsuccessful:

import tensorflow as tf
import tensorflow.contrib.eager as tfe

tfe.enable_eager_execution()

# Attempt 1
x = tf.random_uniform((10, 5, 3))

gru_layer = tf.keras.layers.GRU(2, return_sequences=True, input_shape=(10, 5, 3))
gru_layer = tf.keras.layers.Dropout(0.5)(gru_layer)

# Gives the following error:
# ValueError: Attempt to convert a value (<tensorflow.python.keras._impl.keras.layers.recurrent.GRU object
#  at 0x000001C520681F60>) with an unsupported type (<class 'tensorflow.python.keras._impl.keras.layers.recurrent.GRU'>) 
# to a Tensor.

# Attempt 2
x = tf.random_uniform((10, 5, 3))

gru_layer = tf.keras.layers.GRU(2, return_sequences=True, input_shape=(10, 5, 3))
gru_layer = tf.keras.layers.TimeDistributed(tf.keras.layers.Dropout(0.4))(gru_layer)

# Gives the following error:
# ValueError: as_list() is not defined on an unknown TensorShape.
mauna
  • You're probably missing the input shape parameters to keras layers – BallpointBen May 31 '18 at 20:02
  • @BallpointBen thanks for your input. I've tried it, but it still gives the same error. – mauna May 31 '18 at 20:08
  • Why not just use `keras.layers` with the tensorflow backend? – BallpointBen May 31 '18 at 20:09
  • @BallpointBen Because I find it easier to debug my model when working in Eager mode. – mauna May 31 '18 at 20:13
  • From the code you post here, I don't see how `x` is connected with the rest. According to the Keras [Model (functional API)](https://keras.io/models/model/) docs, neural nets usually start with an `Input` layer. You chain the layers up, create the `Model` from the inputs and outputs, compile the model, and then call `fit()` on it. That's when you pass in `x` and `y`. – neurite May 31 '18 at 21:40

1 Answer


To get the model output without training, as you do in the TF code, the following should work. You need an `Input` layer, you need to hook each layer to the output of the previous one, and you need a `Model` wrapping them:

import numpy as np
from keras.models import Model
from keras.layers import Dropout, GRU, Input

x = np.random.randn(10, 5, 3)

inputs = Input(shape=(5, 3))
gru_layer = GRU(2, return_sequences=True)(inputs)
gru_layer = Dropout(0.5)(gru_layer)

model = Model(inputs=inputs, outputs=gru_layer)

output = model.predict(x)
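One caveat worth noting: `model.predict` runs in inference mode, where `Dropout` is a no-op, so the output above is just the GRU output. If the goal is to see dropout actually applied (as in the `DropoutWrapper` code), a sketch that works in eager mode with `tf.keras` (assuming a TF version where layers can be called directly on tensors) is to skip the `Model` entirely and pass `training=True`:

```python
import tensorflow as tf

x = tf.random.uniform((10, 5, 3))

gru = tf.keras.layers.GRU(2, return_sequences=True)
drop = tf.keras.layers.Dropout(0.5)

# training=True keeps dropout active; by default Keras disables it
# outside of training (e.g. inside model.predict).
out = drop(gru(x), training=True)
print(out.shape)  # (10, 5, 2)
```

This also keeps the step-by-step debuggability that eager mode was chosen for in the first place.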
antishok
    There is a slight problem with this, this applies dropout across all timesteps meaning that some timesteps can be completely dropped. To dropout every timestep independently, you need to specify `noise_shape=(batch_size, 1, features)` [doc](https://keras.io/layers/core/#dropout). – nuric Jun 01 '18 at 10:39
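A hedged sketch of the `noise_shape` suggestion above (assuming eager-style `tf.keras` usage): with `noise_shape=(batch_size, 1, features)`, one mask is drawn per sample and feature and broadcast across the time axis, so a given feature is either kept at every timestep or dropped at every timestep:

```python
import tensorflow as tf

x = tf.random.uniform((10, 5, 3))

gru = tf.keras.layers.GRU(2, return_sequences=True)
# The mask has shape (10, 1, 2) and is broadcast over the 5 timesteps,
# so the same units are dropped at every step of a sequence.
drop = tf.keras.layers.Dropout(0.5, noise_shape=(10, 1, 2))

out = drop(gru(x), training=True)  # shape (10, 5, 2)
```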