I am trying to train a CNN for binary classification of images with the (perhaps unusual) shape of height=2 and width=1000 pixels. My first approach is a small, simple CNN coded as follows:
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Dropout, Flatten, Dense

def cnn_model_01():
    model = Sequential()
    # Assembly of layers
    # input_shape=(channels, height, width): assumes the channels_first image data format
    model.add(Conv2D(16, (2, 2), input_shape=(1, 2, 1000), activation='relu'))
    model.add(MaxPooling2D(pool_size=(1, 1)))
    model.add(Dropout(0.2))
    model.add(Flatten())
    model.add(Dense(128, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    # Compilation of model
    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model

model = cnn_model_01()
# Fit the model
model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=5, batch_size=200, verbose=2)
The accuracy and predictions of the NN simply reflect the class distribution in the sample. Typical training output is:
Epoch 1/5
13s - loss: 0.7772 - acc: 0.5680 - val_loss: 0.6657 - val_acc: 0.6048
Epoch 2/5
15s - loss: 0.6654 - acc: 0.5952 - val_loss: 0.6552 - val_acc: 0.6048
Epoch 3/5
15s - loss: 0.6514 - acc: 0.5952 - val_loss: 0.6396 - val_acc: 0.6048
Epoch 4/5
15s - loss: 0.6294 - acc: 0.5952 - val_loss: 0.6100 - val_acc: 0.6048
Epoch 5/5
13s - loss: 0.5933 - acc: 0.6116 - val_loss: 0.5660 - val_acc: 0.6052
The reason for this is that the NN assigns all input samples to one class. With a sample distributed in exactly this way, it is therefore correct by chance in roughly 60% of cases.
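To make the "correct by chance" point concrete, the accuracy can be compared with the share of the most frequent class. This is only a minimal sketch, assuming the labels are 0/1 values in a NumPy array; `majority_baseline` and `y_example` are names and data made up for illustration, not my actual labels:

```python
import numpy as np

def majority_baseline(y):
    """Accuracy obtained by always predicting the most frequent class."""
    y = np.asarray(y).ravel()
    positive_share = y.mean()  # fraction of samples labelled 1
    return max(positive_share, 1.0 - positive_share)

# Hypothetical labels (assumption): 3 positives out of 5 samples.
y_example = np.array([1, 0, 1, 1, 0])
baseline = majority_baseline(y_example)
print(baseline)
```

If `val_acc` stays at this baseline, as it does at about 0.6048 above, the network is probably predicting a single class for everything.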
To fix the problem and get the NN to produce better results, I inspected the network's output and found that the values fall in a relatively narrow interval, e.g. [0.55, 0.62]. I then tried to map (rescale) this interval to [0, 1]. As a result I got a really good accuracy of ~99%. I did this mapping "by hand": subtract the minimum value of the array from each value and divide by the difference between the maximum and the minimum.
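For reference, the "by hand" mapping described above can be sketched in NumPy roughly like this. The function name `min_max_rescale` and the sample values are hypothetical, and the guard for a constant array is an extra assumption that is not part of what I originally did:

```python
import numpy as np

def min_max_rescale(values):
    """Map values linearly so that the minimum becomes 0 and the maximum becomes 1."""
    values = np.asarray(values, dtype=float)
    lo, hi = values.min(), values.max()
    if hi == lo:
        # Assumption: a constant array has no spread to rescale, so return zeros.
        return np.zeros_like(values)
    return (values - lo) / (hi - lo)

# Hypothetical network outputs inside the narrow interval mentioned above.
raw = np.array([0.55, 0.58, 0.62])
scaled = min_max_rescale(raw)
print(scaled)
```

Note that the minimum and maximum here are taken from the array being rescaled, so the result depends on which set of predictions is passed in.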
Can I implement this mapping in Keras? Is there a layer with this functionality?
Or did I do something completely wrong/not advisable with the layers, which leads to this narrow interval of the output?