
I am working on an autoencoder and I have an issue with reproducing the input in the same size. If I use a transposed convolution / deconvolution operation with the same parameters, I get a different output size than the original input. To illustrate my problem, let us assume our model consists of just one convolution (to encode the input) and one deconvolution (to decode the encoded input). However, I do not get the same size as my input. More precisely, the second and third dimensions (axis 1 and axis 2) are 16 and not, as one would expect, 15. Here is the code:

    import tensorflow as tf

    input = tf.keras.Input(shape=(15, 15, 3), name="Input0")

    conv2d_layer2 = tf.keras.layers.Conv2D(filters=32, strides=[2, 2], kernel_size=[3, 3],
                                           padding='same',
                                           activation='selu', name="Conv1")

    conv2d_trans_layer2 = tf.keras.layers.Conv2DTranspose(filters=32, strides=[2, 2],
                                                          kernel_size=[3, 3], padding='same',
                                                          activation='selu', name="DeConv1")

    x_encoded_1 = conv2d_layer2(input)
    x_reconstructed = conv2d_trans_layer2(x_encoded_1)
    model = tf.keras.Model(inputs=input, outputs=x_reconstructed)

Results in the following model:

Model: "model"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
Input0 (InputLayer)          [(None, 15, 15, 3)]       0         
_________________________________________________________________
Conv1 (Conv2D)               (None, 8, 8, 32)          896       
_________________________________________________________________
DeConv1 (Conv2DTranspose)    (None, 16, 16, 32)        9248      
=================================================================
Total params: 10,144
Trainable params: 10,144

How can I reproduce my original input using just this transposed convolution? Is this possible?

user3352632

1 Answer


By deleting `padding='same'` from both layers (falling back to the default `padding='valid'`), you can reproduce the mapping. With `padding='same'` and stride 2, `Conv2D` maps 15 to ceil(15/2) = 8, and `Conv2DTranspose` maps 8 to 8*2 = 16. With `padding='valid'`, `Conv2D` maps 15 to floor((15-3)/2)+1 = 7, and `Conv2DTranspose` maps 7 back to (7-1)*2+3 = 15:

    from tensorflow.keras.layers import Input, Conv2D, Conv2DTranspose
    from tensorflow.keras.models import Model

    input = Input(shape=(15, 15, 3), name="Input0")

    conv2d_layer2 = Conv2D(filters=32, strides=[2, 2], kernel_size=[3, 3],
                           activation='selu', name="Conv1")(input)

    conv2d_trans_layer2 = Conv2DTranspose(filters=32, strides=[2, 2],
                                          kernel_size=[3, 3],
                                          activation='selu', name="DeConv1")(conv2d_layer2)

    model = Model(inputs=input, outputs=conv2d_trans_layer2)
    model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
Input0 (InputLayer)          [(None, 15, 15, 3)]       0         
_________________________________________________________________
Conv1 (Conv2D)               (None, 7, 7, 32)          896       
_________________________________________________________________
DeConv1 (Conv2DTranspose)    (None, 15, 15, 32)        9248      
=================================================================
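
If you prefer to keep `padding='same'`, another option, not used in the answer above, is to let the decoder produce 16x16 and cut it back to 15x15 with a `Cropping2D` layer. A minimal sketch (layer names carried over from the question, cropping amounts my own choice):

    from tensorflow.keras.layers import Input, Conv2D, Conv2DTranspose, Cropping2D
    from tensorflow.keras.models import Model

    input = Input(shape=(15, 15, 3), name="Input0")

    x = Conv2D(32, kernel_size=3, strides=2, padding='same',
               activation='selu', name="Conv1")(input)          # (None, 8, 8, 32)
    x = Conv2DTranspose(32, kernel_size=3, strides=2, padding='same',
                        activation='selu', name="DeConv1")(x)   # (None, 16, 16, 32)

    # Drop one row from the bottom and one column from the right: 16x16 -> 15x15
    x = Cropping2D(cropping=((0, 1), (0, 1)), name="Crop")(x)   # (None, 15, 15, 32)

    model = Model(inputs=input, outputs=x)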

In general, to do this in deeper architectures you have to play with padding, strides and pooling so that the decoder exactly undoes the shape changes of the encoder, as in the sketch below.
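
For instance, here is a minimal sketch of a two-level model (only the 15x15x3 input and the naming scheme are carried over from the question; the 64-filter middle layer is my own choice) where `padding='valid'` with kernel 3 and stride 2 stays shape-symmetric: 15 → 7 → 3 on the way down and 3 → 7 → 15 on the way up:

    from tensorflow.keras.layers import Input, Conv2D, Conv2DTranspose
    from tensorflow.keras.models import Model

    input = Input(shape=(15, 15, 3), name="Input0")

    # Encoder: with padding='valid', each layer maps n -> floor((n - 3) / 2) + 1
    x = Conv2D(32, kernel_size=3, strides=2, activation='selu', name="Conv1")(input)  # (None, 7, 7, 32)
    x = Conv2D(64, kernel_size=3, strides=2, activation='selu', name="Conv2")(x)      # (None, 3, 3, 64)

    # Decoder: with padding='valid', each layer maps n -> (n - 1) * 2 + 3
    x = Conv2DTranspose(32, kernel_size=3, strides=2,
                        activation='selu', name="DeConv2")(x)                         # (None, 7, 7, 32)
    x = Conv2DTranspose(3, kernel_size=3, strides=2,
                        activation='selu', name="DeConv1")(x)                         # (None, 15, 15, 3)

    model = Model(inputs=input, outputs=x)
    model.summary()

The last transposed layer uses 3 filters so that the output has the same channel count as the input.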

Online there are a lot of good resources that explain how these operations work and how to use them in Keras:

Padding and Stride for Convolutional Neural Networks

Pooling Layers for Convolutional Neural Networks

How to use the UpSampling2D and Conv2DTranspose

Marco Cerliani