I am implementing a neural network in Lasagne in which I would like to share weights between different GRU layers (http://lasagne.readthedocs.io/en/latest/modules/layers/recurrent.html#lasagne.layers.GRULayer). To do this, I replace the reset, update and hidden update gates of the GRU layers with custom gates (http://lasagne.readthedocs.io/en/latest/modules/layers/recurrent.html#lasagne.layers.Gate).
In these gates, I have to define input-to-gate weights W_in and hidden-to-gate weights W_hid. What should be the dimensionality of these weights? My best guess would be that, for input data of shape batch_size x input_len x num_features, dim(W_in) = num_features x num_hidden and dim(W_hid) = num_hidden x num_hidden. However, this does not work.
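Here is a minimal sketch of what I am trying: two GRU layers over inputs of the same shape, built from the same set of Theano shared variables so that (as I understand it) Lasagne reuses them as shared parameters. All sizes and variable names are made up for illustration, and the weight shapes follow my guess above:

```python
import numpy as np
import theano
import lasagne
from lasagne.layers import InputLayer, GRULayer, Gate
from lasagne.nonlinearities import tanh

# Made-up sizes, just for illustration
batch_size, input_len, num_features, num_hidden = 16, 20, 8, 32
floatX = theano.config.floatX

def shared_weights(shape):
    # Theano shared variables so the same parameters can be reused in both layers
    return theano.shared(np.random.normal(0, 0.1, shape).astype(floatX))

# One set of parameters per gate type, shaped according to my guess above
params = {}
for gate_name in ('resetgate', 'updategate', 'hidden_update'):
    params[gate_name] = dict(
        W_in=shared_weights((num_features, num_hidden)),
        W_hid=shared_weights((num_hidden, num_hidden)),
        b=theano.shared(np.zeros((num_hidden,), dtype=floatX)),
    )

def make_gates():
    # W_cell=None because GRU gates have no cell connection;
    # hidden_update uses tanh, the other gates keep the sigmoid default
    return dict(
        resetgate=Gate(W_cell=None, **params['resetgate']),
        updategate=Gate(W_cell=None, **params['updategate']),
        hidden_update=Gate(W_cell=None, nonlinearity=tanh, **params['hidden_update']),
    )

l_in_a = InputLayer(shape=(batch_size, input_len, num_features))
l_in_b = InputLayer(shape=(batch_size, input_len, num_features))

# Two GRU layers that should share the same gate weights
l_gru_a = GRULayer(l_in_a, num_units=num_hidden, **make_gates())
l_gru_b = GRULayer(l_in_b, num_units=num_hidden, **make_gates())
```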
Does anybody have an idea? Thanks in advance!