Briefly, my question is about the image size not staying the same as the input image size after a max-pooling layer when I use padding='same' in Keras. I am working through the Keras blog post "Building Autoencoders in Keras" and building a convolutional autoencoder. The autoencoder code is as follows:
from keras.layers import Input, Conv2D, MaxPooling2D, UpSampling2D
from keras.models import Model

input_layer = Input(shape=(28, 28, 1))
x = Conv2D(16, (3, 3), activation='relu', padding='same')(input_layer)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
encoded = MaxPooling2D((2, 2), padding='same')(x)
# at this point the representation is (4, 4, 8) i.e. 128-dimensional
x = Conv2D(8, (3, 3), activation='relu', padding='same')(encoded)
x = UpSampling2D((2, 2))(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
x = Conv2D(16, (3, 3), activation='relu')(x)
x = UpSampling2D((2, 2))(x)
decoded = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(x)
autoencoder = Model(input_layer, decoded)
autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')
According to autoencoder.summary(), the output of the very first layer, Conv2D(16, (3, 3), activation='relu', padding='same')(input_layer), is 28 x 28 x 16, i.e. the same spatial size as the input image. This is because padding is 'same'.
In [49]: autoencoder.summary()
(The layer numbering was added by me; it is not part of the output.)
_________________________________________________________________
Layer (type)                       Output Shape          Param #
=================================================================
1.  input_1 (InputLayer)           (None, 28, 28, 1)     0
2.  conv2d_1 (Conv2D)              (None, 28, 28, 16)    160
3.  max_pooling2d_1 (MaxPooling2   (None, 14, 14, 16)    0
4.  conv2d_2 (Conv2D)              (None, 14, 14, 8)     1160
5.  max_pooling2d_2 (MaxPooling2   (None, 7, 7, 8)       0
6.  conv2d_3 (Conv2D)              (None, 7, 7, 8)       584
7.  max_pooling2d_3 (MaxPooling2   (None, 4, 4, 8)       0
8.  conv2d_4 (Conv2D)              (None, 4, 4, 8)       584
9.  up_sampling2d_1 (UpSampling2   (None, 8, 8, 8)       0
10. conv2d_5 (Conv2D)              (None, 8, 8, 8)       584
11. up_sampling2d_2 (UpSampling2   (None, 16, 16, 8)     0
12. conv2d_6 (Conv2D)              (None, 14, 14, 16)    1168
13. up_sampling2d_3 (UpSampling2   (None, 28, 28, 16)    0
14. conv2d_7 (Conv2D)              (None, 28, 28, 1)     145
=================================================================
The next layer (layer 3) is MaxPooling2D((2, 2), padding='same')(x). summary() shows the output size of this layer as 14 x 14 x 16. But padding in this layer is also 'same', so why does the output size not remain 28 x 28 x 16, padded with zeros?
It is also not clear to me how the output shape becomes (14 x 14 x 16) after layer 12, when the input shape coming from the layer before it (layer 11) is (16 x 16 x 8).
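For reference, here is a minimal sketch of the output-size arithmetic behind the shapes in the summary. It assumes the usual Keras/TensorFlow conventions: with 'same' padding the output size is ceil(input / stride), with 'valid' padding (the default when no padding is given) it is floor((input - window) / stride) + 1, Conv2D uses a stride of 1 by default, and MaxPooling2D uses a stride equal to its pool size by default. The helper output_size is a hypothetical name for illustration, not a Keras function.

```python
import math


def output_size(input_size, window, stride, padding):
    """Spatial output size along one axis (hypothetical helper, not a Keras API)."""
    if padding == 'same':
        # 'same' pads just enough to cover every stride position:
        # the result depends only on the stride, not on the window size.
        return math.ceil(input_size / stride)
    if padding == 'valid':
        # 'valid' adds no padding, so the window must fit entirely inside the input.
        return (input_size - window) // stride + 1
    raise ValueError("padding must be 'same' or 'valid'")


# Assumed defaults: Conv2D stride is 1; MaxPooling2D stride equals pool_size (2 here).
print(output_size(28, 3, 1, 'same'))   # layer 2:  Conv2D, padding='same'
print(output_size(28, 2, 2, 'same'))   # layer 3:  MaxPooling2D, padding='same'
print(output_size(7, 2, 2, 'same'))    # layer 7:  MaxPooling2D on an odd-sized input
print(output_size(16, 3, 1, 'valid'))  # layer 12: Conv2D with no padding argument
```

Under these assumptions the four calls give 28, 14, 4 and 14, which matches the summary. So 'same' would only preserve the input size when the stride is 1. Layer 12 is the one Conv2D in the code above that has no padding='same', which would explain why it shrinks from 16 to 14.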