
I'm training autoencoders on 2D images using convolutional layers and would like to put fully connected layers on top of the encoder part for classification. My autoencoder is defined as follows (just a simple one for illustration):

from tensorflow.keras.layers import (Input, Conv2D, BatchNormalization,
                                     MaxPooling2D, UpSampling2D, Flatten,
                                     Dense, Concatenate)
from tensorflow.keras.models import Model

input_img = Input(shape=(64, 80, 1))

def encoder(input_img):
    conv1 = Conv2D(32, (3, 3), activation='relu', padding='same')(input_img)
    conv1 = BatchNormalization()(conv1)
    pool1 = MaxPooling2D(pool_size=(2, 2))(conv1)  # (64, 80) -> (32, 40)
    conv2 = Conv2D(64, (3, 3), activation='relu', padding='same')(pool1)
    conv2 = BatchNormalization()(conv2)
    return conv2

def decoder(conv2):
    conv3 = Conv2D(128, (3, 3), activation='relu', padding='same')(conv2)
    conv3 = BatchNormalization()(conv3)
    up1 = UpSampling2D((2, 2))(conv3)  # (32, 40) -> (64, 80)
    decoded = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(up1)
    return decoded

autoencoder = Model(input_img, decoder(encoder(input_img)))
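
Each autoencoder is trained to reconstruct its own inputs, roughly like this (the optimizer, loss and x_train are just placeholders, not part of my actual setup):

autoencoder.compile(optimizer='adam', loss='binary_crossentropy')
# x_train: numpy array of shape (num_images, 64, 80, 1), values scaled to [0, 1]
autoencoder.fit(x_train, x_train, epochs=10, batch_size=32)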

My input images are of size (64, 80, 1). Now, when stacking fully connected layers on top of the encoder, I'm doing the following:

num_classes = 2  # labels are 0 or 1

def fc(enco):
    flat = Flatten()(enco)
    den = Dense(128, activation='relu')(flat)
    out = Dense(num_classes, activation='softmax')(den)
    return out

encode = encoder(input_img)
full_model = Model(input_img, fc(encode))
# Copy the trained encoder weights; for the model above these are the first
# six layers (input, Conv2D, BatchNormalization, MaxPooling2D, Conv2D,
# BatchNormalization):
for l1, l2 in zip(full_model.layers[:6], autoencoder.layers[:6]):
    l1.set_weights(l2.get_weights())
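
(To check which indices the slice should cover, the layer correspondence can be printed first:)

for i, (l1, l2) in enumerate(zip(full_model.layers, autoencoder.layers)):
    print(i, l1.name, l2.name)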

For only one autoencoder this works, but the problem now is that I have two autoencoders, each trained on its own set of images, all of size (64, 80, 1).

For every sample I have two input images of size (64, 80, 1) and one label (0 or 1). I need to feed image 1 into the first autoencoder and image 2 into the second autoencoder. But how can I combine both autoencoders in the full_model in the code above?

Another problem is the input to the fit() method. Until now, with only one autoencoder, the input consisted of just a numpy array of images (e.g. (1000, 64, 80, 1)), but with two autoencoders I would have two sets of images as input. How can I feed these into the fit() method so that the first autoencoder consumes the first set of images and the second autoencoder the second set?

machinery

1 Answer


Q: How can I combine both autoencoders in full_model?

A: You could flatten the bottleneck outputs enco_1 and enco_2 of both autoencoders and concatenate them within fc:

def fc(enco_1, enco_2):
    flat_1 = Flatten()(enco_1)
    flat_2 = Flatten()(enco_2)
    flat = Concatenate()([flat_1, flat_2])
    den = Dense(128, activation='relu')(flat)
    out = Dense(num_classes, activation='softmax')(den)
    return out

encode_1 = encoder_1(input_img_1)
encode_2 = encoder_2(input_img_2)

full_model = Model([input_img_1, input_img_2], fc(encode_1, encode_2))

Note that the last part, where you manually set the weights of the encoder, is unnecessary - see https://keras.io/getting-started/functional-api-guide/#shared-layers
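
A minimal sketch of this shared-layers approach (untested; the names encoder_1, encoder_2, autoencoder_1 and autoencoder_2 are illustrative): build each encoder once as a Model and reuse it in both the autoencoder and full_model, so the classifier automatically sees the trained weights and no manual copying is needed:

input_img_1 = Input(shape=(64, 80, 1))
input_img_2 = Input(shape=(64, 80, 1))

# Each encoder is built once as a Model, so the same layer objects (and
# weights) are used by the autoencoder and by full_model:
encoder_1 = Model(input_img_1, encoder(input_img_1), name='encoder_1')
encoder_2 = Model(input_img_2, encoder(input_img_2), name='encoder_2')

autoencoder_1 = Model(input_img_1, decoder(encoder_1(input_img_1)))
autoencoder_2 = Model(input_img_2, decoder(encoder_2(input_img_2)))
# ... train autoencoder_1 on image set 1 and autoencoder_2 on image set 2 ...

full_model = Model([input_img_1, input_img_2],
                   fc(encoder_1(input_img_1), encoder_2(input_img_2)))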


Q: How can I feed this into the fit method so that the first autoencoder consumes the first set of images and the second autoencoder the second set?

A: In the code above, note that the two encoders are fed with different inputs (one for each image set). Now, provided that the model is defined in this way, you can call full_model.fit as follows:

full_model.fit(x=[images_set_1, images_set_2],
               y=label,
               ...)

NOTE: Not tested.
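
For illustration, with dummy data the full call might look like this (the shapes, loss and hyperparameters are assumptions on my side, not from the question):

import numpy as np

images_set_1 = np.random.rand(1000, 64, 80, 1)  # placeholder for image set 1
images_set_2 = np.random.rand(1000, 64, 80, 1)  # placeholder for image set 2
labels = np.random.randint(0, 2, size=(1000,))  # one binary label per pair

full_model.compile(optimizer='adam',
                   loss='sparse_categorical_crossentropy',  # integer labels
                   metrics=['accuracy'])
full_model.fit(x=[images_set_1, images_set_2],
               y=labels,
               epochs=10,
               batch_size=32)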

rvinas
  • Thank you, I will give it a try. Do you know if by default all layers in the full_model are set to trainable? – machinery Aug 09 '19 at 13:19
  • You're welcome. Yes, they are all trainable by default. – rvinas Aug 09 '19 at 13:51
  • By the way, let's say I have two autoencoders with latent dimension 16 each (i.e. a 16-dimensional vector), so the input to the fully connected network will be 32-dimensional. In total I have around 1000 labels and two classes. How many layers of the fully connected network and how many nodes per layer would you choose to start with? If I use more than one layer, should the number of nodes increase or decrease from one layer to the next (e.g. by a multiple of two)? – machinery Aug 14 '19 at 15:52
  • That's hard to tell, it really depends on the problem. I would tune all these hyperparameters using a validation set – rvinas Aug 14 '19 at 22:06
  • Alright, thanks. What is the usual layout for such a fully connected network? That is, does the number of nodes per layer increase, decrease (e.g. by a factor of two), or stay the same from layer to layer? I think I would need to know a certain range for the parameters when tuning with a validation set (e.g. number of nodes, number of layers). – machinery Aug 15 '19 at 10:50
  • I would say that there is no "usual layout" and both options are possible - it really depends on the problem. That's the alchemy of deep learning. I once asked Ian Goodfellow about his favourite approach to hyperparameter optimization and he suggested an iterative algorithm consisting of two steps: random search followed by tightening the random distributions. You can find the answer [here](https://www.quora.com/Whats-Ian-Goodfellows-favourite-approach-to-hyperparameter-optimization). – rvinas Aug 15 '19 at 11:27
  • Thanks a lot. I will give it a try. It was an interesting read. Still, the initial search range must be determined (searching 1-4 layers or 1-20 layers...) – machinery Aug 27 '19 at 14:53
  • In your code above should it be flat = Concatenate()([flat1, flat2]) in the method? Or should I first concatenate and then flatten? – machinery Aug 27 '19 at 14:54
  • It shouldn't make a difference. I tend to prefer flattening and then concatenating, it looks clearer to me. – rvinas Aug 27 '19 at 17:05