I want to train a convolutional autoencoder on (pretty noisy) images. For that I built a training_generator and a validation_generator that both produce batches of 120 images. I use 111 batches per epoch for training and 47 batches per epoch for validation. The images have three color channels with values from 0 to 255 per channel, but the generator scales them to values between 0 and 1 using images = images.astype('float32') / 255.0. The architecture is:

from keras.layers import Input, Dense, Conv2D, MaxPooling2D, UpSampling2D
from keras.models import Model

input_img = Input(shape=(152, 360, 3))  # adapt this if using `channels_first` image data format

x = Conv2D(16, (3, 3), activation='relu', padding='same')(input_img)    #1
x = MaxPooling2D((2, 2), padding='same')(x)                             #2
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)             #3
x = MaxPooling2D((2, 2), padding='same')(x)                             #4
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)             #5
encoded = MaxPooling2D((2, 2), padding='same')(x)                       #6

# at this point the representation is autoencoder.layers[6].output_shape = (None, 19, 45, 8)

x = Conv2D(8, (3, 3), activation='relu', padding='same')(encoded)       #7
x = UpSampling2D((2, 2))(x)                                             #8
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)             #9
x = UpSampling2D((2, 2))(x)                                             #10
x = Conv2D(16, (3, 3), activation='relu', padding='same')(x)            #11
x = UpSampling2D((2, 2))(x)                                             #12
decoded = Conv2D(3, (3, 3), activation='sigmoid', padding='same')(x)    #13

autoencoder = Model(input_img, decoded)
autoencoder.compile(optimizer='adadelta', loss='mean_squared_error')

autoencoder.fit_generator(
    generator=training_generator,
    validation_data=validation_generator,
    use_multiprocessing=True,
    workers=2,
    epochs=2)
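
As a sanity check on the layer shapes, the following is a minimal sketch in plain Python (no Keras) that follows the spatial size through the three pooling and three upsampling layers. It assumes that pooling with padding='same' and stride 2 rounds the size up, and that the 'same'-padded convolutions leave the size unchanged; `same_pool` and `trace_shapes` are hypothetical helper names, not part of Keras.

```python
import math

def same_pool(size, factor=2):
    """Output length of a stride-`factor` pooling layer with padding='same'."""
    return math.ceil(size / factor)

def trace_shapes(height, width, n_pool=3, factor=2):
    """Follow the spatial size through n_pool poolings, then n_pool upsamplings."""
    h, w = height, width
    for _ in range(n_pool):          # encoder: layers #2, #4, #6
        h, w = same_pool(h, factor), same_pool(w, factor)
    bottleneck = (h, w)
    for _ in range(n_pool):          # decoder: layers #8, #10, #12
        h, w = h * factor, w * factor
    return bottleneck, (h, w)

bottleneck, output = trace_shapes(152, 360)
print(bottleneck, output)            # (19, 45) (152, 360)

# Values per image at the input and at the bottleneck (8 feature maps).
input_values = 152 * 360 * 3
code_values = bottleneck[0] * bottleneck[1] * 8
print(input_values, code_values, input_values / code_values)
```

For the 152 x 360 input the decoder output has the same size as the input, and the bottleneck holds 6840 values per image against 164160 at the input, a factor of 24. Sizes that are not divisible by 8 would not round-trip this way (28 x 28 would come back as 32 x 32, for example).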

I followed https://blog.keras.io/building-autoencoders-in-keras.html but changed the loss from "binary_crossentropy" to "mean_squared_error", because the pixel values are not just 0 or 1, as is the case for the MNIST dataset, but lie between 0 and 1 (after the normalization mentioned above). As one can see, I trained for only two epochs, but I would still expect the loss to improve rapidly at the start. However, the improvement is not too impressive. How can I improve the convergence rate of the optimizer during training?
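
To make the loss choice concrete, here is a small sketch in plain Python (no Keras) that evaluates both losses for a single pixel. The target value 0.3 and the grid of candidate predictions are arbitrary illustrative choices, not values from the dataset, and `mse` and `bce` are hand-written per-pixel versions of the two losses rather than the Keras functions.

```python
import math

def mse(y, p):
    """Squared error for one pixel with target y and prediction p."""
    return (y - p) ** 2

def bce(y, p):
    """Binary cross-entropy for one pixel with target y and prediction p."""
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

y = 0.3                                        # arbitrary non-binary target
candidates = [i / 100 for i in range(1, 100)]  # predictions 0.01 .. 0.99

best_mse = min(candidates, key=lambda p: mse(y, p))
best_bce = min(candidates, key=lambda p: bce(y, p))

print(best_mse, best_bce)                      # 0.3 0.3
print(mse(y, best_mse), bce(y, best_bce))
```

In this sketch both losses are minimized by the same prediction, but the squared error reaches 0 there while the cross-entropy stays at about 0.611. Loss values obtained with the two settings are therefore not directly comparable in magnitude.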

To ask a couple of more specific questions:

  1. Should I use an AveragePooling layer after the input in order to reduce the noise level?
  2. Should I use some other, additional normalization method?
  3. Should I use sigmoid functions instead of relu for all the layers?
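
For question 1, the effect of average pooling on noise can be estimated with a small simulation. This is only a sketch under strong assumptions: a flat gray image, independent zero-mean Gaussian noise per pixel, and a hypothetical noise level `SIGMA`; real image noise may be correlated and behave differently. `average_pool_2x2` is a hand-written stand-in for non-overlapping 2x2 mean pooling, not the Keras layer.

```python
import random
import statistics

random.seed(0)
SIZE, SIGMA = 128, 0.1   # hypothetical image side length and noise level

# A flat gray image (0.5 everywhere) plus independent Gaussian noise per pixel.
noisy = [[0.5 + random.gauss(0.0, SIGMA) for _ in range(SIZE)]
         for _ in range(SIZE)]

def average_pool_2x2(img):
    """Non-overlapping 2x2 mean pooling over a list-of-lists image."""
    return [
        [(img[r][c] + img[r][c + 1] + img[r + 1][c] + img[r + 1][c + 1]) / 4.0
         for c in range(0, len(img[0]), 2)]
        for r in range(0, len(img), 2)
    ]

pooled = average_pool_2x2(noisy)

# Noise level before and after pooling, measured as the standard deviation.
std_before = statistics.pstdev([v for row in noisy for v in row])
std_after = statistics.pstdev([v for row in pooled for v in row])
print(std_before, std_after, std_before / std_after)
```

Under these assumptions, averaging four independent pixels roughly halves the noise standard deviation, at the cost of halving the resolution in each direction.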

Appendix: Loss development

(I marked with ... where no loss change occurred.)

for optimizer='adadelta'

Epoch 1/2
  1/111 [..............................] - ETA: 42:35 - loss: 0.0212
  2/111 [..............................] - ETA: 33:48 - loss: 0.0210
  3/111 [..............................] - ETA: 30:44 - loss: 0.0220
  4/111 [>.............................] - ETA: 29:02 - loss: 0.0227
  5/111 [>.............................] - ETA: 27:59 - loss: 0.0228
  6/111 [>.............................] - ETA: 27:17 - loss: 0.0225
  7/111 [>.............................] - ETA: 26:37 - loss: 0.0222
  8/111 [=>............................] - ETA: 26:01 - loss: 0.0227
  9/111 [=>............................] - ETA: 25:30 - loss: 0.0229
 10/111 [=>............................] - ETA: 25:08 - loss: 0.0227
 11/111 [=>............................] - ETA: 24:48 - loss: 0.0225
 12/111 [==>...........................] - ETA: 24:33 - loss: 0.0224
 13/111 [==>...........................] - ETA: 24:13 - loss: 0.0223
 14/111 [==>...........................] - ETA: 23:48 - loss: 0.0223
 15/111 [===>..........................] - ETA: 23:31 - loss: 0.0224
 16/111 [===>..........................] - ETA: 23:11 - loss: 0.0223
 17/111 [===>..........................] - ETA: 22:52 - loss: 0.0222
 18/111 [===>..........................] - ETA: 22:33 - loss: 0.0221
 19/111 [====>.........................] - ETA: 22:14 - loss: 0.0221
 20/111 [====>.........................] - ETA: 21:55 - loss: 0.0220
 21/111 [====>.........................] - ETA: 21:39 - loss: 0.0219
 22/111 [====>.........................] - ETA: 21:38 - loss: 0.0218
 23/111 [=====>........................] - ETA: 21:25 - loss: 0.0218
 24/111 [=====>........................] - ETA: 21:08 - loss: 0.0217
 25/111 [=====>........................] - ETA: 20:50 - loss: 0.0219
 26/111 [======>.......................] - ETA: 20:32 - loss: 0.0220
 27/111 [======>.......................] - ETA: 20:15 - loss: 0.0219
 28/111 [======>.......................] - ETA: 19:59 - loss: 0.0220
 29/111 [======>.......................] - ETA: 19:42 - loss: 0.0220
 30/111 [=======>......................] - ETA: 19:25 - loss: 0.0219
 ...
 35/111 [========>.....................] - ETA: 18:10 - loss: 0.0219
 36/111 [========>.....................] - ETA: 17:55 - loss: 0.0220
 ...
 38/111 [=========>....................] - ETA: 17:26 - loss: 0.0220
 39/111 [=========>....................] - ETA: 17:11 - loss: 0.0221
 40/111 [=========>....................] - ETA: 16:56 - loss: 0.0220
 41/111 [==========>...................] - ETA: 16:41 - loss: 0.0221
 42/111 [==========>...................] - ETA: 16:26 - loss: 0.0221
 43/111 [==========>...................] - ETA: 16:11 - loss: 0.0220
 44/111 [==========>...................] - ETA: 15:57 - loss: 0.0220
 45/111 [===========>..................] - ETA: 15:42 - loss: 0.0219
 ...
 50/111 [============>.................] - ETA: 14:30 - loss: 0.0219
 51/111 [============>.................] - ETA: 14:16 - loss: 0.0218
 ...
 54/111 [=============>................] - ETA: 13:34 - loss: 0.0218
 55/111 [=============>................] - ETA: 13:20 - loss: 0.0219
 ...
 57/111 [==============>...............] - ETA: 12:51 - loss: 0.0219
 58/111 [==============>...............] - ETA: 12:37 - loss: 0.0218
 59/111 [==============>...............] - ETA: 12:23 - loss: 0.0219
 60/111 [===============>..............] - ETA: 12:13 - loss: 0.0218
 ...
 62/111 [===============>..............] - ETA: 11:44 - loss: 0.0218
 63/111 [================>.............] - ETA: 11:29 - loss: 0.0217
 ...
 71/111 [==================>...........] - ETA: 9:32 - loss: 0.0217
 72/111 [==================>...........] - ETA: 9:17 - loss: 0.0218
 73/111 [==================>...........] - ETA: 9:03 - loss: 0.0217
 74/111 [===================>..........] - ETA: 8:49 - loss: 0.0218
 ...
 78/111 [====================>.........] - ETA: 7:52 - loss: 0.0218
 79/111 [====================>.........] - ETA: 7:37 - loss: 0.0217
 ...
 86/111 [======================>.......] - ETA: 5:57 - loss: 0.0217
 87/111 [======================>.......] - ETA: 5:42 - loss: 0.0218
 88/111 [======================>.......] - ETA: 5:28 - loss: 0.0217
 ...
 95/111 [========================>.....] - ETA: 3:48 - loss: 0.0217
 96/111 [========================>.....] - ETA: 3:34 - loss: 0.0216
 ...
100/111 [==========================>...] - ETA: 2:38 - loss: 0.0216
101/111 [==========================>...] - ETA: 2:23 - loss: 0.0215
 ...
106/111 [===========================>..] - ETA: 1:11 - loss: 0.0215
107/111 [===========================>..] - ETA: 57s - loss: 0.0214 
 ...
110/111 [============================>.] - ETA: 14s - loss: 0.0214
111/111 [==============================] - 1864s 17s/step - loss: 0.0214 - val_loss: 0.0201
Epoch 2/2
  1/111 [..............................] - ETA: 30:08 - loss: 0.0186
  2/111 [..............................] - ETA: 28:15 - loss: 0.0206
  3/111 [..............................] - ETA: 27:45 - loss: 0.0204
  4/111 [>.............................] - ETA: 27:50 - loss: 0.0208
  5/111 [>.............................] - ETA: 27:44 - loss: 0.0211
  6/111 [>.............................] - ETA: 27:40 - loss: 0.0209
  7/111 [>.............................] - ETA: 27:54 - loss: 0.0207
  8/111 [=>............................] - ETA: 27:36 - loss: 0.0211
  9/111 [=>............................] - ETA: 27:09 - loss: 0.0212
 10/111 [=>............................] - ETA: 26:48 - loss: 0.0210
 11/111 [=>............................] - ETA: 26:24 - loss: 0.0208
 12/111 [==>...........................] - ETA: 26:03 - loss: 0.0210
 13/111 [==>...........................] - ETA: 25:42 - loss: 0.0210
 14/111 [==>...........................] - ETA: 25:33 - loss: 0.0212
 15/111 [===>..........................] - ETA: 25:12 - loss: 0.0211
 16/111 [===>..........................] - ETA: 24:51 - loss: 0.0210
 17/111 [===>..........................] - ETA: 24:30 - loss: 0.0208
 18/111 [===>..........................] - ETA: 24:13 - loss: 0.0207
 19/111 [====>.........................] - ETA: 23:54 - loss: 0.0206
 20/111 [====>.........................] - ETA: 23:39 - loss: 0.0205
 21/111 [====>.........................] - ETA: 23:23 - loss: 0.0206
 22/111 [====>.........................] - ETA: 23:04 - loss: 0.0206
 23/111 [=====>........................] - ETA: 22:49 - loss: 0.0205
 ...
 27/111 [======>.......................] - ETA: 21:53 - loss: 0.0205
 28/111 [======>.......................] - ETA: 21:38 - loss: 0.0206
 29/111 [======>.......................] - ETA: 21:39 - loss: 0.0207
 30/111 [=======>......................] - ETA: 21:27 - loss: 0.0206
 31/111 [=======>......................] - ETA: 21:11 - loss: 0.0206
 32/111 [=======>......................] - ETA: 20:53 - loss: 0.0206
 33/111 [=======>......................] - ETA: 20:36 - loss: 0.0207
 34/111 [========>.....................] - ETA: 20:20 - loss: 0.0206
 35/111 [========>.....................] - ETA: 20:02 - loss: 0.0206
 36/111 [========>.....................] - ETA: 19:45 - loss: 0.0205
 37/111 [=========>....................] - ETA: 19:32 - loss: 0.0205
 38/111 [=========>....................] - ETA: 19:15 - loss: 0.0205
 39/111 [=========>....................] - ETA: 18:58 - loss: 0.0204
 40/111 [=========>....................] - ETA: 18:42 - loss: 0.0205
 41/111 [==========>...................] - ETA: 18:27 - loss: 0.0205
 42/111 [==========>...................] - ETA: 18:10 - loss: 0.0206
 43/111 [==========>...................] - ETA: 17:53 - loss: 0.0206
 44/111 [==========>...................] - ETA: 17:37 - loss: 0.0206
 45/111 [===========>..................] - ETA: 17:21 - loss: 0.0205
 46/111 [===========>..................] - ETA: 17:06 - loss: 0.0205
 47/111 [===========>..................] - ETA: 16:49 - loss: 0.0204
 48/111 [===========>..................] - ETA: 16:32 - loss: 0.0204
 49/111 [============>.................] - ETA: 16:17 - loss: 0.0203
 50/111 [============>.................] - ETA: 16:01 - loss: 0.0204
 51/111 [============>.................] - ETA: 15:45 - loss: 0.0203
 52/111 [=============>................] - ETA: 15:29 - loss: 0.0203
 53/111 [=============>................] - ETA: 15:14 - loss: 0.0203
 54/111 [=============>................] - ETA: 14:58 - loss: 0.0203
 55/111 [=============>................] - ETA: 14:41 - loss: 0.0203
 56/111 [==============>...............] - ETA: 14:25 - loss: 0.0203
 57/111 [==============>...............] - ETA: 14:09 - loss: 0.0203
 58/111 [==============>...............] - ETA: 13:53 - loss: 0.0203
 59/111 [==============>...............] - ETA: 13:37 - loss: 0.0203
 60/111 [===============>..............] - ETA: 13:23 - loss: 0.0203
 61/111 [===============>..............] - ETA: 13:07 - loss: 0.0203
 62/111 [===============>..............] - ETA: 12:50 - loss: 0.0203
 63/111 [================>.............] - ETA: 12:38 - loss: 0.0203
 64/111 [================>.............] - ETA: 12:24 - loss: 0.0202
 65/111 [================>.............] - ETA: 12:08 - loss: 0.0202
 66/111 [================>.............] - ETA: 11:52 - loss: 0.0203
 67/111 [=================>............] - ETA: 11:36 - loss: 0.0203
 68/111 [=================>............] - ETA: 11:21 - loss: 0.0202
 69/111 [=================>............] - ETA: 11:04 - loss: 0.0203
 70/111 [=================>............] - ETA: 10:48 - loss: 0.0204
 71/111 [==================>...........] - ETA: 10:32 - loss: 0.0203
 72/111 [==================>...........] - ETA: 10:16 - loss: 0.0203
 73/111 [==================>...........] - ETA: 9:59 - loss: 0.0203 
 74/111 [===================>..........] - ETA: 9:43 - loss: 0.0203
 75/111 [===================>..........] - ETA: 9:27 - loss: 0.0203
 76/111 [===================>..........] - ETA: 9:11 - loss: 0.0203
 77/111 [===================>..........] - ETA: 8:55 - loss: 0.0203
 78/111 [====================>.........] - ETA: 8:39 - loss: 0.0203
 79/111 [====================>.........] - ETA: 8:23 - loss: 0.0203
 80/111 [====================>.........] - ETA: 8:07 - loss: 0.0203
 81/111 [====================>.........] - ETA: 7:51 - loss: 0.0203
 82/111 [=====================>........] - ETA: 7:35 - loss: 0.0203
 83/111 [=====================>........] - ETA: 7:19 - loss: 0.0203
 84/111 [=====================>........] - ETA: 7:03 - loss: 0.0203
 85/111 [=====================>........] - ETA: 6:47 - loss: 0.0203
 86/111 [======================>.......] - ETA: 6:31 - loss: 0.0203
 87/111 [======================>.......] - ETA: 6:16 - loss: 0.0203
 88/111 [======================>.......] - ETA: 6:00 - loss: 0.0203
 89/111 [=======================>......] - ETA: 5:44 - loss: 0.0203
 90/111 [=======================>......] - ETA: 5:28 - loss: 0.0202
 91/111 [=======================>......] - ETA: 5:12 - loss: 0.0202
 92/111 [=======================>......] - ETA: 4:56 - loss: 0.0202
 93/111 [========================>.....] - ETA: 4:41 - loss: 0.0201
 94/111 [========================>.....] - ETA: 4:25 - loss: 0.0201
 95/111 [========================>.....] - ETA: 4:09 - loss: 0.0201
 96/111 [========================>.....] - ETA: 3:53 - loss: 0.0201
 97/111 [=========================>....] - ETA: 3:38 - loss: 0.0202
 98/111 [=========================>....] - ETA: 3:22 - loss: 0.0202
 99/111 [=========================>....] - ETA: 3:07 - loss: 0.0202
100/111 [==========================>...] - ETA: 2:51 - loss: 0.0202
101/111 [==========================>...] - ETA: 2:35 - loss: 0.0202
102/111 [==========================>...] - ETA: 2:20 - loss: 0.0202
103/111 [==========================>...] - ETA: 2:04 - loss: 0.0203
104/111 [===========================>..] - ETA: 1:48 - loss: 0.0202
105/111 [===========================>..] - ETA: 1:33 - loss: 0.0202
106/111 [===========================>..] - ETA: 1:17 - loss: 0.0202
107/111 [===========================>..] - ETA: 1:02 - loss: 0.0202
108/111 [============================>.] - ETA: 46s - loss: 0.0202 
109/111 [============================>.] - ETA: 31s - loss: 0.0202
110/111 [============================>.] - ETA: 15s - loss: 0.0202
111/111 [==============================] - 1976s 18s/step - loss: 0.0202 - val_loss: 0.0214

for optimizer=keras.optimizers.Adam(lr=1e-2)

Epoch 1/2
  1/111 [..............................] - ETA: 55:34 - loss: 0.0302
  2/111 [..............................] - ETA: 42:45 - loss: 0.0302
  3/111 [..............................] - ETA: 37:07 - loss: 0.0288
  4/111 [>.............................] - ETA: 33:36 - loss: 0.0278
  5/111 [>.............................] - ETA: 31:12 - loss: 0.0270
  6/111 [>.............................] - ETA: 29:47 - loss: 0.0261
  7/111 [>.............................] - ETA: 28:40 - loss: 0.0258
  8/111 [=>............................] - ETA: 27:42 - loss: 0.0257
  9/111 [=>............................] - ETA: 26:53 - loss: 0.0253
 10/111 [=>............................] - ETA: 26:16 - loss: 0.0250
 11/111 [=>............................] - ETA: 25:44 - loss: 0.0246
 12/111 [==>...........................] - ETA: 25:13 - loss: 0.0243
 13/111 [==>...........................] - ETA: 24:45 - loss: 0.0241
 14/111 [==>...........................] - ETA: 24:19 - loss: 0.0239
 15/111 [===>..........................] - ETA: 23:59 - loss: 0.0237
 16/111 [===>..........................] - ETA: 23:33 - loss: 0.0237
 17/111 [===>..........................] - ETA: 23:09 - loss: 0.0235
 18/111 [===>..........................] - ETA: 22:46 - loss: 0.0234
 19/111 [====>.........................] - ETA: 22:26 - loss: 0.0234
 20/111 [====>.........................] - ETA: 22:06 - loss: 0.0232
 21/111 [====>.........................] - ETA: 21:46 - loss: 0.0232
 22/111 [====>.........................] - ETA: 21:27 - loss: 0.0231
 23/111 [=====>........................] - ETA: 21:08 - loss: 0.0230
 24/111 [=====>........................] - ETA: 20:49 - loss: 0.0228
 25/111 [=====>........................] - ETA: 20:31 - loss: 0.0228
 26/111 [======>.......................] - ETA: 20:14 - loss: 0.0228
 27/111 [======>.......................] - ETA: 19:57 - loss: 0.0227
 28/111 [======>.......................] - ETA: 19:40 - loss: 0.0226
 29/111 [======>.......................] - ETA: 19:25 - loss: 0.0225
 30/111 [=======>......................] - ETA: 19:20 - loss: 0.0224
 31/111 [=======>......................] - ETA: 19:07 - loss: 0.0223
 32/111 [=======>......................] - ETA: 18:56 - loss: 0.0223
 33/111 [=======>......................] - ETA: 18:45 - loss: 0.0223
 34/111 [========>.....................] - ETA: 18:31 - loss: 0.0222
 35/111 [========>.....................] - ETA: 18:19 - loss: 0.0223
 36/111 [========>.....................] - ETA: 18:09 - loss: 0.0223
 37/111 [=========>....................] - ETA: 17:57 - loss: 0.0222
 38/111 [=========>....................] - ETA: 17:40 - loss: 0.0222
 39/111 [=========>....................] - ETA: 17:24 - loss: 0.0222
 40/111 [=========>....................] - ETA: 17:09 - loss: 0.0221
 41/111 [==========>...................] - ETA: 16:54 - loss: 0.0221
 42/111 [==========>...................] - ETA: 16:39 - loss: 0.0221
 43/111 [==========>...................] - ETA: 16:28 - loss: 0.0221
 44/111 [==========>...................] - ETA: 16:17 - loss: 0.0221
 45/111 [===========>..................] - ETA: 16:08 - loss: 0.0221
 46/111 [===========>..................] - ETA: 15:57 - loss: 0.0220
 47/111 [===========>..................] - ETA: 15:41 - loss: 0.0221
 48/111 [===========>..................] - ETA: 15:24 - loss: 0.0221
 49/111 [============>.................] - ETA: 15:07 - loss: 0.0220
 50/111 [============>.................] - ETA: 14:50 - loss: 0.0220
 51/111 [============>.................] - ETA: 14:33 - loss: 0.0220
 52/111 [=============>................] - ETA: 14:17 - loss: 0.0219
 53/111 [=============>................] - ETA: 14:01 - loss: 0.0219
 54/111 [=============>................] - ETA: 13:45 - loss: 0.0218
 55/111 [=============>................] - ETA: 13:31 - loss: 0.0218
 56/111 [==============>...............] - ETA: 13:16 - loss: 0.0218
 57/111 [==============>...............] - ETA: 13:02 - loss: 0.0218
 58/111 [==============>...............] - ETA: 12:46 - loss: 0.0217
 59/111 [==============>...............] - ETA: 12:30 - loss: 0.0217
 60/111 [===============>..............] - ETA: 12:14 - loss: 0.0217
 61/111 [===============>..............] - ETA: 11:58 - loss: 0.0216
 62/111 [===============>..............] - ETA: 11:42 - loss: 0.0216
 63/111 [================>.............] - ETA: 11:27 - loss: 0.0215
 64/111 [================>.............] - ETA: 11:11 - loss: 0.0215
 65/111 [================>.............] - ETA: 10:55 - loss: 0.0214
 66/111 [================>.............] - ETA: 10:40 - loss: 0.0214
 67/111 [=================>............] - ETA: 10:25 - loss: 0.0213
 68/111 [=================>............] - ETA: 10:13 - loss: 0.0213
 69/111 [=================>............] - ETA: 9:58 - loss: 0.0213 
 70/111 [=================>............] - ETA: 9:43 - loss: 0.0212
 ...
 72/111 [==================>...........] - ETA: 9:14 - loss: 0.0212
 73/111 [==================>...........] - ETA: 8:59 - loss: 0.0211
 ...
 76/111 [===================>..........] - ETA: 8:16 - loss: 0.0211
 77/111 [===================>..........] - ETA: 8:01 - loss: 0.0210
 78/111 [====================>.........] - ETA: 7:46 - loss: 0.0211
 79/111 [====================>.........] - ETA: 7:32 - loss: 0.0210
 80/111 [====================>.........] - ETA: 7:17 - loss: 0.0210
 81/111 [====================>.........] - ETA: 7:03 - loss: 0.0209
 82/111 [=====================>........] - ETA: 6:49 - loss: 0.0209
 83/111 [=====================>........] - ETA: 6:35 - loss: 0.0208
 84/111 [=====================>........] - ETA: 6:21 - loss: 0.0208
 85/111 [=====================>........] - ETA: 6:07 - loss: 0.0207
 ...
 89/111 [=======================>......] - ETA: 5:11 - loss: 0.0207
 90/111 [=======================>......] - ETA: 4:56 - loss: 0.0206
 ...
 92/111 [=======================>......] - ETA: 4:28 - loss: 0.0206
 93/111 [========================>.....] - ETA: 4:14 - loss: 0.0205
 94/111 [========================>.....] - ETA: 4:00 - loss: 0.0205
 95/111 [========================>.....] - ETA: 3:46 - loss: 0.0204
 ...
 98/111 [=========================>....] - ETA: 3:03 - loss: 0.0204
 99/111 [=========================>....] - ETA: 2:49 - loss: 0.0203
100/111 [==========================>...] - ETA: 2:35 - loss: 0.0204
101/111 [==========================>...] - ETA: 2:21 - loss: 0.0204
102/111 [==========================>...] - ETA: 2:07 - loss: 0.0204
103/111 [==========================>...] - ETA: 1:53 - loss: 0.0203
104/111 [===========================>..] - ETA: 1:38 - loss: 0.0203
105/111 [===========================>..] - ETA: 1:24 - loss: 0.0203
106/111 [===========================>..] - ETA: 1:10 - loss: 0.0203
107/111 [===========================>..] - ETA: 56s - loss: 0.0202 
 ...
110/111 [============================>.] - ETA: 14s - loss: 0.0202
111/111 [==============================] - 1823s 16s/step - loss: 0.0202 - val_loss: 0.0186

Epoch 2/2
  1/111 [..............................] - ETA: 36:31 - loss: 0.0192
  2/111 [..............................] - ETA: 32:08 - loss: 0.0182
  3/111 [..............................] - ETA: 30:47 - loss: 0.0177
  4/111 [>.............................] - ETA: 29:42 - loss: 0.0176
  5/111 [>.............................] - ETA: 28:54 - loss: 0.0175
  6/111 [>.............................] - ETA: 28:15 - loss: 0.0180
  7/111 [>.............................] - ETA: 27:47 - loss: 0.0180
  8/111 [=>............................] - ETA: 27:29 - loss: 0.0178
  9/111 [=>............................] - ETA: 27:04 - loss: 0.0178
 10/111 [=>............................] - ETA: 26:44 - loss: 0.0178
 11/111 [=>............................] - ETA: 26:27 - loss: 0.0180
 12/111 [==>...........................] - ETA: 26:08 - loss: 0.0182
 13/111 [==>...........................] - ETA: 25:50 - loss: 0.0182
 14/111 [==>...........................] - ETA: 25:27 - loss: 0.0181
 15/111 [===>..........................] - ETA: 25:15 - loss: 0.0180
 16/111 [===>..........................] - ETA: 24:55 - loss: 0.0183
 17/111 [===>..........................] - ETA: 24:37 - loss: 0.0184
 18/111 [===>..........................] - ETA: 24:19 - loss: 0.0184
 19/111 [====>.........................] - ETA: 24:00 - loss: 0.0183
 20/111 [====>.........................] - ETA: 23:42 - loss: 0.0183
 21/111 [====>.........................] - ETA: 23:23 - loss: 0.0183
 22/111 [====>.........................] - ETA: 23:06 - loss: 0.0182
 23/111 [=====>........................] - ETA: 22:49 - loss: 0.0182
 24/111 [=====>........................] - ETA: 22:32 - loss: 0.0183
 25/111 [=====>........................] - ETA: 22:11 - loss: 0.0184
 26/111 [======>.......................] - ETA: 21:53 - loss: 0.0183
 27/111 [======>.......................] - ETA: 21:41 - loss: 0.0184
 28/111 [======>.......................] - ETA: 21:23 - loss: 0.0184
 29/111 [======>.......................] - ETA: 21:12 - loss: 0.0184
 30/111 [=======>......................] - ETA: 20:58 - loss: 0.0185
 31/111 [=======>......................] - ETA: 20:41 - loss: 0.0185
 ...
 33/111 [=======>......................] - ETA: 20:13 - loss: 0.0185
 34/111 [========>.....................] - ETA: 20:00 - loss: 0.0186
 35/111 [========>.....................] - ETA: 19:44 - loss: 0.0187
 36/111 [========>.....................] - ETA: 19:25 - loss: 0.0186
 37/111 [=========>....................] - ETA: 19:09 - loss: 0.0187
 ...
 40/111 [=========>....................] - ETA: 18:23 - loss: 0.0187
 41/111 [==========>...................] - ETA: 18:11 - loss: 0.0186
 ...
 43/111 [==========>...................] - ETA: 17:53 - loss: 0.0186
 44/111 [==========>...................] - ETA: 17:44 - loss: 0.0185
 ...
 48/111 [===========>..................] - ETA: 16:48 - loss: 0.0185
 49/111 [============>.................] - ETA: 16:37 - loss: 0.0184
 ...
 58/111 [==============>...............] - ETA: 14:27 - loss: 0.0184
 59/111 [==============>...............] - ETA: 14:10 - loss: 0.0183
 60/111 [===============>..............] - ETA: 13:54 - loss: 0.0183
 61/111 [===============>..............] - ETA: 13:37 - loss: 0.0183
 62/111 [===============>..............] - ETA: 13:20 - loss: 0.0182
 63/111 [================>.............] - ETA: 13:02 - loss: 0.0183
 ...
 67/111 [=================>............] - ETA: 11:51 - loss: 0.0183
 68/111 [=================>............] - ETA: 11:35 - loss: 0.0182
 ...
 75/111 [===================>..........] - ETA: 9:36 - loss: 0.0182
 76/111 [===================>..........] - ETA: 9:19 - loss: 0.0181
 77/111 [===================>..........] - ETA: 9:02 - loss: 0.0182
 78/111 [====================>.........] - ETA: 8:46 - loss: 0.0182
 79/111 [====================>.........] - ETA: 8:30 - loss: 0.0181
 ...
 82/111 [=====================>........] - ETA: 7:40 - loss: 0.0181
 83/111 [=====================>........] - ETA: 7:24 - loss: 0.0180
 84/111 [=====================>........] - ETA: 7:08 - loss: 0.0181
 85/111 [=====================>........] - ETA: 6:52 - loss: 0.0180
 ...
 99/111 [=========================>....] - ETA: 3:09 - loss: 0.0180
100/111 [==========================>...] - ETA: 2:53 - loss: 0.0179
101/111 [==========================>...] - ETA: 2:38 - loss: 0.0180
102/111 [==========================>...] - ETA: 2:22 - loss: 0.0179
103/111 [==========================>...] - ETA: 2:06 - loss: 0.0180
 ...
110/111 [============================>.] - ETA: 15s - loss: 0.0180
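
To put these numbers on the original pixel scale, the sketch below converts the reported mean squared errors into a root-mean-square error in gray levels (0 to 255). It assumes the loss is a plain MSE over inputs scaled to [0, 1], as described above; the values are copied from the logs, and `rmse_in_gray_levels` is a hypothetical helper name.

```python
import math

# End-of-epoch values copied from the logs above (MSE on inputs in [0, 1]).
reported = {
    "adadelta, epoch 1 (train)": 0.0214,
    "adadelta, epoch 1 (val)": 0.0201,
    "adadelta, epoch 2 (train)": 0.0202,
    "adadelta, epoch 2 (val)": 0.0214,
    "Adam lr=1e-2, epoch 1 (train)": 0.0202,
    "Adam lr=1e-2, epoch 1 (val)": 0.0186,
    "Adam lr=1e-2, epoch 2, last logged batch (train)": 0.0180,
}

def rmse_in_gray_levels(mse):
    """Root-mean-square error rescaled from [0, 1] to the 0..255 range."""
    return math.sqrt(mse) * 255.0

for label, mse in reported.items():
    print(f"{label}: MSE {mse:.4f} -> RMSE {rmse_in_gray_levels(mse):.1f} gray levels")
```

On this scale the reported losses correspond to a root-mean-square error of roughly 34 to 37 gray levels per channel value.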
Make42
  • For how many epochs have you trained the network? Have you tried changing the optimizer to Adam or RMSprop? Or increasing the learning rate like: `optimizer=optimizers.Adam(lr=1e-2)`? – today Sep 16 '18 at 20:24
  • @today: Only 2 epochs, but I would expect some significant changes in loss at the beginning of training. No, I have not tried another optimizer, as this was the one given by the keras example, and I did not think the issue would be with the optimizer type or its settings, but with the layers (see my updated, more specific questions). – Make42 Sep 16 '18 at 20:27
  • Well, I suggest tuning the optimizer first, because there is nothing wrong with the layers or the normalization scheme you have used. The optimizer is a component of the network the same way the layers are, so it needs tuning as well. If you can't get anywhere by modifying the optimizer settings, then go back to the layers and other hyperparameters of the network. – today Sep 16 '18 at 20:35
  • @today: and Adam with learning rate 1e-2 is your recommendation for this case? – Make42 Sep 16 '18 at 20:43
  • If you see that the loss value is decreasing but the rate of going down is very slow, then one (possible) solution is to increase the learning rate (cautiously). As for the optimizer anything with an adaptive learning rate may work fine: Adam, RMSprop, Adadelta, etc. At the end, you must experiment as there is no specific and definite answer. – today Sep 16 '18 at 20:50
  • @today: I added the loss development for adadelta. – Make42 Sep 16 '18 at 21:19
  • Is this after increasing the learning rate? – today Sep 17 '18 at 07:26
  • @today: I added clarification. – Make42 Sep 17 '18 at 10:23
  • So I guess increasing the learning rate helps but you must train for much more than 2 epochs (and monitor val loss to prevent overfitting). BTW, why does it take so much time to train? Are you training on CPU? – today Sep 17 '18 at 10:37
  • Further, note that it does not matter whether the loss increases or stays fixed between batches; rather, what matters is the loss at the end of each epoch (and also the val loss). – today Sep 17 '18 at 10:39
  • @today: Yes, I train on a CPU of a Lenovo Thinkpad T530 with an i5 that came out 2012... – Make42 Sep 17 '18 at 10:56
  • @today: Can I not tell keras to do cross validation and stop training when the validation loss starts to get bigger again (which would indicate overfitting)? – Make42 Sep 17 '18 at 10:57
  • Of course you can. You can use [EarlyStopping](https://keras.io/callbacks/#earlystopping) callback to stop training when val loss starts increasing. Further, you can save the model using [ModelCheckpoint](https://keras.io/callbacks/#modelcheckpoint) callback at the end of each epoch (you can configure it to only save the best model so far). – today Sep 17 '18 at 11:24
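
The stopping rule discussed in the last comments can be sketched in plain Python. This is a simplified model of the idea, not the Keras implementation: `early_stopping_epoch` and `best_epoch` are hypothetical helpers, and the exact way the real EarlyStopping callback counts its patience may differ in off-by-one details. The example validation losses are the two values from the adadelta run.

```python
def early_stopping_epoch(val_losses, patience=1, min_delta=0.0):
    """Return the 1-based epoch at which training would stop, or None.

    Training stops once the validation loss has failed to improve on its
    best value by more than min_delta for `patience` consecutive epochs.
    """
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best - min_delta:
            best, wait = loss, 0      # new best value: reset the counter
        else:
            wait += 1                 # no improvement in this epoch
            if wait >= patience:
                return epoch
    return None

def best_epoch(val_losses):
    """1-based epoch with the lowest validation loss (the model one would keep)."""
    return min(range(len(val_losses)), key=lambda i: val_losses[i]) + 1

adadelta_val = [0.0201, 0.0214]       # val_loss after epochs 1 and 2 (from the logs)
print(early_stopping_epoch(adadelta_val, patience=1))   # 2
print(early_stopping_epoch(adadelta_val, patience=3))   # None
print(best_epoch(adadelta_val))                         # 1
```

With only two epochs and a validation loss that fluctuates this much, a patience of 1 would stop almost immediately, so a larger patience combined with keeping the best model so far is probably the safer setting to experiment with.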

0 Answers