I have a dataset of images, each containing one, two, three, or more playing cards. There are 52 different cards in total, so I have 52 classes and therefore 52 neurons in my output layer.
Training a CNN with one card per image works well. A label is one-hot, for example [0,0,...,1,0,0]. This is the last layer and the compile step of my network for this task:
model.add(layers.Dense(52, activation='softmax'))
optimizer = keras.optimizers.Adam(learning_rate=0.00001)
model.compile(loss='categorical_crossentropy', metrics=['accuracy'], optimizer=optimizer)
Training my network on images with two or more cards is more challenging for me.
Since one image now contains more than one card, a possible label for such an image would look like: [0,1,0,...,1,0,0].
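To make the label format concrete, here is a minimal sketch of how such a multi-hot vector could be built. The card-to-index mapping `card_index` (suits ordered clubs, diamonds, hearts, spades, with ranks 2 to ace inside each suit) and the helper `multi_hot` are hypothetical and only for illustration; any fixed ordering of the 52 classes would work the same way.

```python
# Hypothetical card naming; the ordering of the 52 classes is an assumption.
RANKS = ['2', '3', '4', '5', '6', '7', '8', '9', '10', 'J', 'Q', 'K', 'A']
SUITS = ['clubs', 'diamonds', 'hearts', 'spades']

# Map every (rank, suit) pair to a fixed class index 0..51.
card_index = {(rank, suit): i
              for i, (rank, suit) in enumerate(
                  (r, s) for s in SUITS for r in RANKS)}

def multi_hot(cards, num_classes=52):
    """Return a 0/1 list with a 1 at the index of every card in the image."""
    label = [0] * num_classes
    for card in cards:
        label[card_index[card]] = 1
    return label

# Image showing the ace of hearts and the king of hearts.
label = multi_hot([('A', 'hearts'), ('K', 'hearts')])
print(sum(label))  # number of cards marked in the label
```

In practice the list would be converted to a float array before being passed to the network.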
I would start with the same network architecture, but:
I think I now have to use sigmoid instead of softmax in the last layer, since each class is independent.
For the loss, I would simply use something like mse = tf.keras.losses.MeanSquaredError().
For the accuracy metric, I am not sure what to use.
model.add(layers.Dense(52, activation='sigmoid'))
adam = keras.optimizers.Adam(learning_rate=0.00001)
model.compile(loss=mse, metrics=['__?__'], optimizer=adam)
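To make the accuracy question concrete: with sigmoid outputs, I imagine thresholding each of the 52 outputs and then counting either per-label matches or exact matches of the whole vector. The sketch below is plain Python, not a Keras metric; the 0.5 threshold and the function names are my own assumptions.

```python
def binarize(probs, threshold=0.5):
    """Turn sigmoid outputs into 0/1 predictions (threshold is an assumption)."""
    return [1 if p >= threshold else 0 for p in probs]

def per_label_accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of individual labels predicted correctly over all images."""
    correct = total = 0
    for truth, probs in zip(y_true, y_prob):
        preds = binarize(probs, threshold)
        correct += sum(t == p for t, p in zip(truth, preds))
        total += len(truth)
    return correct / total

def exact_match_accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of images where the whole label vector is predicted correctly."""
    hits = sum(list(truth) == binarize(probs, threshold)
               for truth, probs in zip(y_true, y_prob))
    return hits / len(y_true)

# Toy example with 4 classes instead of 52.
y_true = [[0, 1, 0, 1], [1, 0, 0, 0]]
y_prob = [[0.1, 0.9, 0.2, 0.8], [0.7, 0.6, 0.1, 0.2]]
print(per_label_accuracy(y_true, y_prob))    # 7 of 8 labels correct
print(exact_match_accuracy(y_true, y_prob))  # 1 of 2 images fully correct
```

With only a few ones among 52 entries, the per-label number would look high even for a network that predicts no cards at all, which is part of why I am unsure which metric to report.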
How wrong am I with these settings?
I searched a lot, but confusingly I have not found any helpful comments. People always suggest using YOLO, but I don't want to detect objects; I only want to classify. For example: the picture contains an ace of hearts and a king of hearts. Where they are doesn't matter.
One more confusion: I have read several times that a CNN can only handle single-class problems. Is that true? I hope not, but if it is, why, and how can I still solve my problem using Keras?
Here is the full network:
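from tensorflow import keras
from tensorflow.keras import layers, models
from tensorflow.keras.layers import BatchNormalization

model = models.Sequential()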
model.add(layers.Conv2D(32, (5, 5), activation='relu',input_shape=(500, 500, 3)))
model.add(BatchNormalization())
model.add(layers.MaxPooling2D((4, 4)))
model.add(layers.Conv2D(64, (5, 5), activation='relu'))
model.add(layers.MaxPooling2D((4, 4)))
model.add(BatchNormalization())
model.add(layers.Conv2D(64, (5, 5), activation='relu'))
model.add(layers.MaxPooling2D((3, 3)))
model.add(BatchNormalization())
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(BatchNormalization())
model.add(layers.Flatten())
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dropout(0.2))
model.add(layers.Dense(52, activation='softmax'))
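For reference, the feature-map sizes of this network can be worked out by hand. The sketch below assumes the Keras defaults that the code above relies on (padding='valid', convolution stride 1, pooling stride equal to the pool size); the helper names `conv_out` and `pool_out` are hypothetical.

```python
def conv_out(size, kernel):
    """Output size of a stride-1 'valid' convolution along one dimension."""
    return size - kernel + 1

def pool_out(size, pool):
    """Output size of 'valid' max pooling with stride equal to the pool size."""
    return size // pool

size = 500
size = pool_out(conv_out(size, 5), 4)  # Conv 5x5 -> 496, pool 4 -> 124
size = pool_out(conv_out(size, 5), 4)  # Conv 5x5 -> 120, pool 4 -> 30
size = pool_out(conv_out(size, 5), 3)  # Conv 5x5 -> 26, pool 3 -> 8
size = conv_out(size, 3)               # Conv 3x3 -> 6

flat_features = size * size * 64       # the last Conv2D has 64 filters
print(size, flat_features)             # 6 2304
```

So the Flatten layer passes 6 * 6 * 64 = 2304 values into the Dense(64) layer.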