mxnet: multiple dropout layers with shared mask

Question

I'd like to reproduce a recurrent neural network in which each time layer is followed by a dropout layer, and all of these dropout layers share the same mask. This structure is described in, among other places, "A Theoretically Grounded Application of Dropout in Recurrent Neural Networks".

As far as I understand the code, the recurrent network models implemented in MXNet do not have any dropout layers applied between time layers; the dropout parameter of functions such as lstm (R API, Python API) actually defines dropout on the input. Therefore I'd need to reimplement these functions from scratch.

However, the Dropout layer does not seem to accept a variable that defines the mask as a parameter.

Is it possible to make multiple dropout layers in different places of the computation graph, yet sharing their masks?

score 1 · Answer 1 · answered Jan 08 '18 at 07:42

According to the discussion here, it is not possible to specify the mask, and setting a random seed has no effect on dropout's random number generator.
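
Since the built-in Dropout layer cannot take a mask, one possible workaround is to skip it and apply dropout by hand: sample a single mask and multiply it into the output of every time layer. The sketch below is not taken from the linked discussion. It uses NumPy rather than MXNet only to show the idea, and the helper names `make_shared_mask` and `apply_shared_dropout`, the shapes, and the stand-in outputs are all hypothetical.

```python
import numpy as np

def make_shared_mask(shape, drop_prob, rng):
    """Sample one inverted-dropout mask to be reused at every time step."""
    keep_prob = 1.0 - drop_prob
    # Bernoulli(keep_prob) entries, rescaled so the expected activation is unchanged.
    return (rng.random(shape) < keep_prob).astype(np.float32) / keep_prob

def apply_shared_dropout(step_outputs, mask):
    """Multiply the output of every time step by the same mask."""
    return [h * mask for h in step_outputs]

rng = np.random.default_rng(0)
batch_size, hidden_size, num_steps = 4, 8, 5

# Stand-in for the per-time-step outputs of an unrolled RNN (hypothetical data).
step_outputs = [np.ones((batch_size, hidden_size), dtype=np.float32)
                for _ in range(num_steps)]

# The mask is sampled once per sequence, not once per time step.
mask = make_shared_mask((batch_size, hidden_size), drop_prob=0.5, rng=rng)
dropped = apply_shared_dropout(step_outputs, mask)
```

In an MXNet graph the same pattern would presumably mean sampling the mask once per forward pass as an ordinary tensor and using an elementwise multiply wherever a Dropout layer would otherwise go. The mask should be applied only during training.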

Sina Afrooze

There is work to fix the issue with RNG, but you still cannot specify a mask. – Sina Afrooze Jan 12 '18 at 03:44
