
I'm trying to implement a Siamese Neural Network in TensorFlow, but I cannot find any working example on the Internet (see Yann LeCun's paper).


The architecture I'm trying to build would consist of two LSTMs sharing weights and only connected at the end of the network.

My question is: how do I build two neural networks that share their weights (tied weights) in TensorFlow, and how do I connect them at the end?

Thanks :)

Edit: I implemented a simple, working example of a Siamese network on MNIST here.


1 Answer


Update with tf.layers

If you use the tf.layers module to build your network, you can simply use the argument reuse=True for the second part of the Siamese network:

import tensorflow as tf

x = tf.ones((1, 3))
y1 = tf.layers.dense(x, 4, name='h1')
y2 = tf.layers.dense(x, 4, name='h1', reuse=True)  # reuses the weights created for y1

# y1 and y2 are computed with the same weights, so they evaluate to the same values
sess = tf.Session()
sess.run(tf.global_variables_initializer())
print(sess.run(y1))
print(sess.run(y2))  # both prints return the same values
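
In a real Siamese setup the two branches receive different inputs. Here is a minimal sketch of the same pattern with two placeholders (the names x1/x2 and the layer size are my own, chosen for illustration; run it in a fresh graph):

tf.reset_default_graph()
x1 = tf.placeholder(tf.float32, [None, 3])
x2 = tf.placeholder(tf.float32, [None, 3])
emb1 = tf.layers.dense(x1, 4, name='h1')
emb2 = tf.layers.dense(x2, 4, name='h1', reuse=True)  # shares the weights created for emb1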

Old answer with tf.get_variable

You can try using the function tf.get_variable() (see the TensorFlow tutorial on sharing variables).

Implement the first network using a variable scope with reuse=False:

with tf.variable_scope('Inference', reuse=False):
    weights_1 = tf.get_variable('weights', shape=[1, 1],
                                initializer=...)  # choose an initializer, e.g. tf.random_normal_initializer()
    output_1 = weights_1 * input_1

Then implement the second network with the same code, except using reuse=True:

with tf.variable_scope('Inference', reuse=True):
    weights_2 = tf.get_variable('weights')
    output_2 = weights_2 * input_2

The first implementation will create and initialize every variable of the network (the LSTM in your case), whereas the second will use tf.get_variable() to retrieve the same variables created by the first. That way, the variables are shared between the two branches.
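
As a hedged sketch of how this pattern could look with actual LSTMs, closer to what the question asks (the tower function, unit count, and placeholder shapes are my assumptions, not from the original answer):

def lstm_tower(inputs, reuse):
    # Both calls use the same variable scope, so the second call
    # (reuse=True) picks up the LSTM weights created by the first.
    with tf.variable_scope('siamese_lstm', reuse=reuse):
        cell = tf.nn.rnn_cell.LSTMCell(num_units=64)
        outputs, state = tf.nn.dynamic_rnn(cell, inputs, dtype=tf.float32)
        return state.h  # final hidden state as the embedding

input_1 = tf.placeholder(tf.float32, [None, 20, 8])  # (batch, time, features)
input_2 = tf.placeholder(tf.float32, [None, 20, 8])
output_1 = lstm_tower(input_1, reuse=False)
output_2 = lstm_tower(input_2, reuse=True)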

Then you just have to use whatever loss you want (e.g. the L2 distance between the outputs of the two Siamese branches), and the gradients will backpropagate through both networks, updating the shared variables with the sum of the two gradients.
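
For example, a minimal sketch of such an L2 loss, assuming output_1 and output_2 are the two branch outputs from above:

# Squared L2 distance between the two embeddings, averaged over the batch.
l2_distance = tf.reduce_sum(tf.square(output_1 - output_2), axis=1)
loss = tf.reduce_mean(l2_distance)
train_op = tf.train.GradientDescentOptimizer(0.01).minimize(loss)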

  • You can also define all the variables once, like `weights = tf.Variable(...)`, and then use these variables in each inference: `output_1 = weights * input_1` and `output_2 = weights * input_2`. As with shared variables, here the variable `weights` will receive two gradients and two gradient updates (see the sketch after these comments). – Olivier Moindrot Apr 26 '16 at 14:02
  • I've a doubt, is it necessary to use tf.get_variable()? Can we directly use tf.conv2d() without creating a variable using tf.get_variable() at all? – kunal18 Jan 15 '18 at 16:07
  • @kunal18 : I added an example with `tf.layers` – Olivier Moindrot Jan 15 '18 at 19:31
  • Thanks for the update! Can you please look at my question here: https://stackoverflow.com/questions/48266886/defining-a-siamese-network-in-tensorflow – kunal18 Jan 15 '18 at 19:40
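
A minimal sketch of the alternative described in the first comment above, sharing a single tf.Variable directly (the shapes and names are mine, for illustration):

import tensorflow as tf

# The variable is created once; both branches use the same Python reference,
# so it receives gradients from both and is updated with their sum.
weights = tf.Variable(tf.random_normal([1, 1]), name='weights')

input_1 = tf.placeholder(tf.float32, [None, 1])
input_2 = tf.placeholder(tf.float32, [None, 1])
output_1 = weights * input_1
output_2 = weights * input_2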