
I have an input formed by 7 groups, each with 3 values:

[ a0, a1, a2, b0, b1, b2, ..., g0, g1, g2]

The 3 values within each group are strongly related to one another, and all 7 groups behave the same way, so each group can be treated identically.

I would like to create a small neural network that processes one group's information (the 3 values) and replicate it (as seven blocks) to handle the whole input. All of these blocks would share the same weights, and each block would be responsible for one group. At the end, the outputs of these blocks would be concatenated and processed by another NN.

I'm asking because I want to minimize the effort of training the first layers (the ones responsible for processing the input). By taking advantage of the fact that all of these groups behave the same way, I would only need to train one piece of it.

What I'm asking for is like a ConvNet kernel. But a ConvNet kernel would slide over each window of 3 neighbouring values, mixing the groups like (a0,a1,a2), (a1,a2,b0), (a2,b0,b1), etc. and producing a larger output.

I'm just beginning with TensorFlow, and I have no idea how to create this model. Can you help me think through how to build it?

Hi Emma. Thank you. Actually, I already created a model and it worked. But I have to train models for other cases (like 6x8 and 20x12) that have similar behavior. So, I was wondering if TensorFlow allows me to build this structure. – Filipe Vinadé Mar 19 '19 at 04:57

2 Answers


You are right, I think a ConvNet is the way to go. To avoid "blending" between a(n) and b(n) (in your example), you can set the strides parameter of the Conv layer. Example:

x = Conv1D(nb_features, kernel_size=3, strides=3, padding='valid')(x)

This gives an output of length (7*3)/3 = 7, one position per group. Choose nb_features so that enough information per group is preserved. For example:

nb_features = 3

could be a starting point.
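
To make this concrete, here is a minimal sketch of the idea in Keras. The input shape, the Flatten, and the Dense head are my own illustrative assumptions, not part of the answer above; adjust them to your task.

from tensorflow.keras.layers import Input, Conv1D, Flatten, Dense
from tensorflow.keras.models import Model

nb_features = 3
inp = Input(shape=(21, 1))   # 7 groups x 3 values, one channel
# stride 3 keeps the 3-wide kernel aligned with the groups, so they never overlap
x = Conv1D(nb_features, kernel_size=3, strides=3, padding='valid')(inp)  # -> (7, nb_features)
x = Flatten()(x)
out = Dense(1)(x)            # the "another NN" on top; adjust to your task
model = Model(inp, out)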

Simon Caby

How I solved it: Variable Sharing

All blocks use the same set of variables (weights and biases). To do this, every variable must be named.
With named variables, when a layer is created, its weights can be made references to the already existing variables.
Finally, the blocks' outputs can be concatenated and used as input for another network.
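
A minimal sketch of this variable-sharing idea, using TF1-style variable scopes. The scope and variable names, the hidden size, and the simple head are illustrative choices, not the exact code from my model:

import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

def shared_block(group, n_hidden=4):
    # every call fetches the same named variables, so all 7 blocks
    # share one set of weights and biases
    with tf.variable_scope("block", reuse=tf.AUTO_REUSE):
        w = tf.get_variable("w", shape=[3, n_hidden])
        b = tf.get_variable("b", shape=[n_hidden])
        return tf.nn.relu(tf.matmul(group, w) + b)

inputs = tf.placeholder(tf.float32, shape=[None, 21])       # 7 groups x 3 values
groups = [inputs[:, 3 * i:3 * (i + 1)] for i in range(7)]   # split into the 7 groups
merged = tf.concat([shared_block(g) for g in groups], axis=1)

# the "another network" on top of the concatenated block outputs
with tf.variable_scope("head"):
    w = tf.get_variable("w", shape=[7 * 4, 1])
    b = tf.get_variable("b", shape=[1])
    output = tf.matmul(merged, w) + b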

This structure has a name: Siamese Neural Network.
I didn't know the name, so I couldn't find anything about it.

Now that I have the name, I found this question: Siamese Neural Network in TensorFlow
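
For completeness, the same weight sharing can also be written in Keras by reusing a single layer instance, which is the usual way to build Siamese branches. Layer sizes and names here are illustrative:

from tensorflow.keras.layers import Input, Dense, Lambda, Concatenate
from tensorflow.keras.models import Model

inp = Input(shape=(21,))
shared = Dense(4, activation='relu')   # one layer object = one shared weight set
# slice the flat input into 7 groups of 3 and push each through the same layer
groups = [Lambda(lambda t, i=i: t[:, 3 * i:3 * (i + 1)])(inp) for i in range(7)]
merged = Concatenate()([shared(g) for g in groups])
out = Dense(1)(merged)                 # head network; adjust to your task
model = Model(inp, out)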