I read about the Keras implementation of dropout, and it seems to use the inverted-dropout variant even though the layer is simply called Dropout.
Here is what I have understood from reading the Keras and TensorFlow documentation:
When I specify Dropout(0.4), the 0.4 means that every node in that layer has a 40% chance of being dropped, i.e. 0.4 is the drop probability. By the logic of inverted dropout, the outputs of the remaining neurons are then scaled by a factor of 1/0.6, since the keep probability is 0.6.
(Please point out if this interpretation is incorrect; my whole question rests on it.)
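To make my interpretation concrete, here is a minimal sketch of how I would expect it to behave (assuming Keras 2.x with the layer explicitly called in training mode; the shapes are just for illustration):

```python
import numpy as np
from keras.layers import Dropout
from keras import backend as K

# Feed a tensor of ones through Dropout(0.4) in training mode.
x = K.constant(np.ones((1, 10)))
y = K.eval(Dropout(0.4)(x, training=True))

# If 0.4 is the drop probability with inverted scaling, roughly 40% of
# the entries should be zero and the survivors should read 1/0.6 ~= 1.667.
print(y)
```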
On the other hand, TensorFlow asks for the keep probability directly, meaning that if I specify a value of 0.4, each node has a 60% chance of being dropped.
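For example, in plain TensorFlow 1.x, where tf.nn.dropout takes a keep_prob argument (graph-mode session code below, just for illustration):

```python
import tensorflow as tf

x = tf.ones((1, 10))
# keep_prob=0.4: each unit is KEPT with probability 0.4, so roughly 60%
# are zeroed out and the survivors are scaled to 1/0.4 = 2.5.
y = tf.nn.dropout(x, keep_prob=0.4)

with tf.Session() as sess:
    print(sess.run(y))
```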
So what happens when I use TensorFlow as the backend of Keras? Does it take 0.4 to be the keep probability or the drop probability?
(Using Python 3.6 with the latest versions of all required libraries.)