I want to implement a Siamese Convolutional Neural Network, where two images share weights in the convolutional layers, and are then concatenated before being passed through the fully-connected layers. I have tried an implementation, but it seems rather a "hacked" solution. In particular, I have defined an operation on tensors as simply a Python function, and I'm not sure whether this is allowed.
Here is the code I have tried:
images = tf.placeholder(tf.float32, shape=[None, 64 * 64])
# Convolutional layers
# ...
# ...
# Results in pool3_flat, which is the flattened output of the third convolutional layer
pool3_flat = tf.reshape(pool3, [-1, 8 * 8 * 128])
# Now, merge the image pairs, where each pair is composed of adjacent images in the batch, with a stride of 2
def merge_pairs():
# Create a tensor to store the merged image pairs
# The batch size is 128, therefore there will be 64 pairs (64 in the first dimension of this tensor)
merged_pairs = tf.Variable(tf.zeros([64, 8 * 8 * 128]))
# Split the images into 64 pairs
pairs = tf.split(0, 64, pool3_flat)
# For each pair, concatenate the two images across dimension 1, and set this tensor in the appropriate row of merged_pairs
for pair_num, pair in enumerate(pairs):
merged_pair = tf.concat(1, pair)
merged_pairs[pair_num] = merged_pair
return merged_pairs
# Proceed with operations on the merged_pair tensor, as if the batch size is 64
fc4 = tf.matmul(merge_pairs(), weights4)
# ...
# ...
Whilst this compiles and seems to run fine, the results are not really as expected. So, I'm wondering if there is a better way to implement a Siamese network using built-in operations in TensorFlow?