I read about the Keras implementation of dropout, and it seems to use the inverted-dropout version even though the layer is simply called Dropout.

Here is what I have understood from reading the Keras and TensorFlow documentation:

When I specify Dropout(0.4), the 0.4 means that every node in that layer has a 40% chance of being dropped, i.e. 0.4 is the drop probability. By the logic of inverted dropout, the outputs of the remaining neurons are then scaled by a factor of 1/0.6, since the keep probability is 0.6.

(Please point out if my interpretation is incorrect. My whole doubt is based on this interpretation.)
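
For concreteness, here is a small NumPy sketch of the inverted-dropout behaviour I am describing (the 0.4 rate is the only thing taken from my Keras call; NumPy and the toy numbers are mine, just for illustration):

    import numpy as np

    rate = 0.4                            # drop probability, as I read Keras' argument
    x = np.ones(10000)                    # toy activations
    mask = np.random.rand(10000) >= rate  # keep each unit with probability 0.6
    y = x * mask / (1.0 - rate)           # inverted dropout: survivors scaled by 1/0.6
    print(np.mean(y == 0))                # ~0.4 of the units are zeroed
    print(y.max())                        # survivors equal 1/0.6 = 1.666...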

TensorFlow, on the other hand, asks for the keep probability directly (the keep_prob argument of tf.nn.dropout), meaning that if I specify a value of 0.4, each node has a 60% chance of being dropped.
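
For example, with the TensorFlow 1.x API (current as I write this):

    import tensorflow as tf

    x = tf.ones([10000])
    y = tf.nn.dropout(x, keep_prob=0.4)  # 0.4 is the KEEP probability here
    with tf.Session() as sess:
        out = sess.run(y)
    print((out == 0).mean())             # ~0.6 of the units are zeroed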

So what happens when I use TensorFlow as the Keras backend? Does it treat the 0.4 as the keep probability or the drop probability?

(Using Python 3.6 with the latest versions of all required libraries.)

Tanmay Bhatnagar

2 Answers

Dropout(0.4) in a Keras layer means 40% of your neurons are being dropped (not kept).

From the Keras documentation:

Dropout consists in randomly setting a fraction rate of input units to 0 at each update during training time, which helps prevent overfitting.
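
You can check this empirically with the backend function that the layer calls under the hood (a minimal sketch, assuming Keras 2.x; the exact call may differ in other versions):

    import numpy as np
    from keras import backend as K

    x = K.ones((1, 10000))
    y = K.eval(K.dropout(x, level=0.4))  # the same call the Dropout layer makes
    print(np.mean(y == 0))               # roughly 0.4, i.e. 40% of units dropped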

petezurich
  • I clearly understood that; as you can see, I stated the very same thing in the question. But that is NOT what I am asking. TensorFlow implements inverted dropout but asks for the keep probability, so if I say dropout(0.4) in TensorFlow the node will have a 60% chance of being dropped. MY QUESTION: If I use TensorFlow as the backend of Keras and specify dropout(0.4), THEN does the node have a 40% chance of being dropped or a 40% chance of being kept? – Tanmay Bhatnagar Sep 27 '17 at 19:08
  • Thanks for your comment. I think I understood you correctly. I perceive Keras as an API that abstracts away a lot of the backend in order to provide a unified interface regardless of your backend. So in almost all regards you wouldn't notice any difference in Keras models whatever backend you use. But anyway: a simple method to check for yourself would be to set the dropout extremely high, say 0.99 or even 1.0. You will see that this fraction of neurons is dropped (not kept); a sketch of that check follows below. – petezurich Sep 27 '17 at 20:46
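
A rough sketch of that check (assuming Keras 2.x with the TensorFlow backend; the learning-phase plumbing may vary across versions):

    import numpy as np
    from keras.models import Sequential
    from keras.layers import Dropout
    from keras import backend as K

    model = Sequential([Dropout(0.99, input_shape=(1000,))])
    # Passing learning_phase=1 forces training behaviour, so dropout is applied
    f = K.function([model.input, K.learning_phase()], [model.output])
    out = f([np.ones((1, 1000)), 1])[0]
    print(np.mean(out == 0))  # ~0.99: almost all neurons are dropped, not kept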

Looking at the source code (Line 72) of the Dropout layer can also help determine the answer.

Dropout consists in randomly setting a fraction "rate" of input units to 0 at each update during training time, which helps prevent overfitting.

The source also references a paper (which I assume outlines exactly how Keras implements dropout), found here, written by Nitish Srivastava et al.


Reading the source a bit more, though, it looks like the layer calls the backend implementation of dropout around line 107:

return K.dropout(inputs, self.rate, noise_shape, seed=self.seed)

where K is the backend. It might be worth looking into how K.dropout is implemented in your backend of choice if you're still curious.
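
For what it's worth, the TensorFlow backend converts the Keras rate into a keep probability before handing it to TensorFlow. Abridged and paraphrased from keras/backend/tensorflow_backend.py (check your installed version for the exact code):

    import tensorflow as tf

    def dropout(x, level, noise_shape=None, seed=None):
        # 'level' is the Keras drop rate; tf.nn.dropout (TF 1.x) expects the keep probability
        retain_prob = 1. - level
        return tf.nn.dropout(x, retain_prob, noise_shape, seed=seed)

In other words, assuming this is the code path your install takes, Dropout(0.4) still means a 40% chance of being dropped; the conversion to TensorFlow's keep_prob happens inside the backend.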

KDecker
  • I mentioned in my question that TensorFlow implements inverted dropout but asks for the keep probability, i.e., if I specify 0.4 then that node has a 60% chance of being dropped. BUT if I specify the same in Keras, that very node has a 40% chance of being dropped. SO IF I USE TENSORFLOW as the backend of Keras and I specify dropout(0.4), then WHAT IS THE PROBABILITY of that node being dropped: 40% or 60%? – Tanmay Bhatnagar Sep 27 '17 at 19:04
  • "Dropout consists in randomly setting a fraction "rate" of input units to 0 at each update during training time, which helps prevent overfitting." – KDecker Sep 28 '17 at 13:14