
I wish to view the final output of training a tf.keras model. In this case it would be an array of predictions from the softmax function, e.g. [0,0,0,1,0,1].

Other threads here have suggested using model.predict(training_data), but this won't work in my situation since I am using dropout at both training and validation time, so neurons are randomly dropped and predicting again on the same data gives a different result.

def get_model():
    inputs = tf.keras.layers.Input(shape=(input_dims,))
    x = tf.keras.layers.Dropout(rate=dropout_rate)(inputs, training=True)
    x = tf.keras.layers.Dense(units=29, activation='relu')(x)
    x = tf.keras.layers.Dropout(rate=dropout_rate)(x, training=True)  
    x = tf.keras.layers.Dense(units=15, activation='relu')(x)
    outputs = tf.keras.layers.Dense(2, activation='softmax')(x)
    model = tf.keras.Model(inputs=inputs, outputs=outputs)
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',      
                  metrics=['sparse_categorical_accuracy'])
    return model

myModel = get_model()
myModel.summary()
myModel.fit(X_train, y_train,
            batch_size=batch_size,
            epochs=epochs,
            verbose=1,
            validation_data=(X_val, y_val))

In TensorFlow, you can grab the output of a model after training quite easily. Here is an example from a GitHub repo:

input = tf.placeholder(tf.float32, shape=[None, INPUT_DIMS])
labels = tf.placeholder(tf.float32, shape=[None])

hidden = tf.nn.tanh(make_nn_layer(normalized, NUM_HIDDEN))
logits = make_nn_layer(hidden, NUM_CLASSES)
outputs = tf.argmax(logits, 1)

int_labels = tf.to_int64(labels)
cross_entropy = tf.nn.sparse_softmax_cross_entropy_with_logits(logits, int_labels, name='xentropy')
train_step = tf.train.AdamOptimizer().minimize(cross_entropy)

correct_prediction = tf.equal(outputs, int_labels)
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

with tf.Session() as sess:
    sess.run(tf.initialize_all_variables())

    validation_dict = {
        input: validation_data[:,0:7],
        labels: validation_data[:,7],}

    for i in range(NUM_BATCHES):
        batch = training_data[numpy.random.choice(training_size, BATCH_SIZE, False),:]
        train_step.run({input: batch[:,0:7], labels: batch[:,7]})

        if i % 100 == 0 or i == NUM_BATCHES - 1:
            print('Accuracy %.2f%% at step %d' % (accuracy.eval(validation_dict) * 100, i))

    output_data = outputs.eval({input: data_vector[:,0:7]})

The only output I seem to be able to get from the trained model is a History object. There is also a myModel.output attribute, but it is a symbolic tensor that I can't evaluate without feeding data into it. Any ideas?

  • Are you asking how to get a visual model of your model? Or are you looking for the output data, something that would look like `x = 0, y = 1`? – Cygnus Aug 09 '19 at 23:08
  • @Cygnus Yeah I'm looking for the output data (the predictions), similar to how model.predict() would work. – la_leche Aug 09 '19 at 23:21

1 Answer


As far as I know, you can't turn off dropout after passing training=True when calling the layers (unless you transfer the weights to a new model with the same architecture). However, you can instead build and train your model the normal way (i.e. without using the training argument in the calls) and then selectively turn the dropout layers on and off at test time by defining a backend function (i.e. keras.backend.function()) and setting the learning phase (i.e. keras.backend.learning_phase()):

# build your model normally (i.e. without using `training=True` argument)

# train your model...

from keras import backend as K

func = K.function(model.inputs + [K.learning_phase()], model.outputs)

# run the model with dropout layers being active, i.e. learning_phase == 1
preds = func(list_of_input_arrays + [1])

# run the model with dropout layers being inactive, i.e. learning_phase == 0
preds = func(list_of_input_arrays + [0])
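
For example, if the goal is Monte Carlo dropout style uncertainty estimates (my assumption about the use case, not something stated in the question), you can call this backend function several times with the learning phase set to 1 and aggregate the stochastic softmax outputs. This is only a sketch: `x` and `n_iter` below are placeholders, and `input_dims` is taken from the question:

import numpy as np

# hypothetical input batch; the second dimension must match the model's `input_dims`
x = np.random.rand(10, input_dims).astype('float32')

n_iter = 100  # number of stochastic forward passes (illustrative)

# collect softmax outputs with the dropout layers active (learning_phase == 1)
stochastic_preds = np.stack([func([x, 1])[0] for _ in range(n_iter)])

# per-sample mean and standard deviation of the class probabilities
mean_preds = stochastic_preds.mean(axis=0)  # shape: (10, 2)
std_preds = stochastic_preds.std(axis=0)    # shape: (10, 2)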

Update: As I suggested above, another approach is to define a new model with the same architecture but without setting training=True, and then transfer the weights from the trained model to this new model. To achieve this, I just add a training argument to your get_model() function:

def get_model(training=None):
    inputs = tf.keras.layers.Input(shape=(input_dims,))
    x = tf.keras.layers.Dropout(rate=dropout_rate)(inputs, training=training)
    x = tf.keras.layers.Dense(units=29, activation='relu')(x)
    x = tf.keras.layers.Dropout(rate=dropout_rate)(x, training=training)  
    x = tf.keras.layers.Dense(units=15, activation='relu')(x)
    outputs = tf.keras.layers.Dense(2, activation='softmax')(x)
    model = tf.keras.Model(inputs=inputs, outputs=outputs)
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',      
                  metrics=['sparse_categorical_accuracy'])
    return model

# build a model with dropout layers active in both training and test phases
myModel = get_model(training=True)
# train the model
myModel.fit(...)

# build a clone of the model with dropouts deactivated in test phase
myTestModel = get_model()  # note: the `training` is `None` by default
# transfer the weights from the trained model to this model
myTestModel.set_weights(myModel.get_weights())
# use the new model in test phase; the dropouts would not be active
myTestModel.predict(...)
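
As a quick sanity check (just a sketch; `X_val` and `y_val` are the validation arrays from the question), you can compare the deterministic test-phase predictions of `myTestModel` against the integer labels used with `sparse_categorical_crossentropy`:

import numpy as np

# deterministic softmax probabilities from the dropout-free model
probs = myTestModel.predict(X_val)

# predicted class = index of the highest probability per row
pred_classes = np.argmax(probs, axis=1)

# simple accuracy against the integer class labels
print('test-phase accuracy: %.4f' % np.mean(pred_classes == y_val))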
  • Awesome, thanks. I just tested the second solution you offered with the training data and the accuracy I get from prediction is the same as the accuracy reported in training. – la_leche Aug 10 '19 at 19:27
  • Actually, @today it appears in my test the accuracy values were coincidentally equal up to the 4th digit. But when shuffling the dataset the accuracy values are not the same, meaning the output you get using `myTestModel.predict()` is not the same output generated during training of `myModel`. You can see this using some toy data: `from sklearn.datasets import make_circles; X, y = make_circles(n_samples=1000, noise=0.05)  # generate a 2d classification dataset` – la_leche Aug 12 '19 at 20:01
  • @la_leche I'm sorry but I could not understand your points. What do you mean by shuffling the dataset affecting the accuracy? And how do you get the **output** of `myModel` during training? Don't forget that the accuracy logged in the progress bar is the average of all the previous batch accuracy values, and after each batch the model weights change due to back-propagation. Therefore you cannot compare accuracy values printed in log bar during training time with the accuracy you got in prediction time. Also I don't understand what you mean by "values were **coincidentally** equal"? equal with? – today Aug 13 '19 at 05:18
  • Ohh I see. Yes I was comparing the accuracy of myTestModel to the progress bar. By "coincidentally" I meant when I first made the test, the progress bar reported acc = 0.9871 and the sklearn acc = 0.98702... They seemed identical but it was just rounding. Testing with other data would give values like 0.9527 and 0.9243, which are a bit different. Thank you for pointing out the progress bar point, it does seem like your solution is what I want after all. – la_leche Aug 13 '19 at 16:02
  • Thank you for this solution. I wonder how you would plot the predictions along with the confidence intervals in this situation? The model generates a 2D numpy array for the mean and std; I am not sure what these signify. – cmp Dec 06 '19 at 23:05