I have a dataset of images, each containing one, two, three, or more playing cards. Since there are 52 different cards in total, I have 52 classes and therefore 52 neurons in my output layer.

Training the network with one card per image works well with a CNN. A label looks like this, for example: [0,0,...,1,0,0]. This is the last layer of my network for this task:

model.add(layers.Dense(52, activation='softmax'))
optimizer = keras.optimizers.Adam(learning_rate=0.00001)
model.compile(loss='categorical_crossentropy', metrics=['accuracy'], optimizer=optimizer)

Training my network on two or more cards per image is more challenging for me. Since one image now contains more than one card, a possible label for such an image would look like [0,1,0,...,1,0,0]. I would start with the same network architecture, but I think that for this problem I have to use sigmoid instead of softmax in the last layer, since each class is independent. For the loss I would simply use something like mse = tf.keras.losses.MeanSquaredError(). For the accuracy metric I am not sure.

model.add(layers.Dense(52, activation='sigmoid'))
adam = keras.optimizers.Adam(learning_rate=0.00001)
model.compile(loss=mse, metrics=['__?__'], optimizer=adam)

How wrong am I with these settings?

I searched a lot, but confusingly I have not found any helpful comments. People always suggest using YOLO, but I don't want to detect objects; I only want to classify. For example: in the picture there is an ace of hearts and a king of hearts, and where they are doesn't matter.

One more point of confusion: I read several times that a CNN can only classify single-class problems. Is that true? I hope not, but if it is, why, and how can I still solve my problem using Keras?

Here is the total network:

from tensorflow.keras import layers, models

model = models.Sequential()
model.add(layers.Conv2D(32, (5, 5), activation='relu', input_shape=(500, 500, 3)))
model.add(layers.BatchNormalization())
model.add(layers.MaxPooling2D((4, 4)))
model.add(layers.Conv2D(64, (5, 5), activation='relu'))
model.add(layers.MaxPooling2D((4, 4)))
model.add(layers.BatchNormalization())
model.add(layers.Conv2D(64, (5, 5), activation='relu'))
model.add(layers.MaxPooling2D((3, 3)))
model.add(layers.BatchNormalization())
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.BatchNormalization())

model.add(layers.Flatten())
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dropout(0.2))
model.add(layers.Dense(52, activation='softmax'))

nuemlouno

1 Answer

I read several times that a CNN can only classify single-class problems

That's false. With a CNN you can train a binary classification problem, a multiclass problem, and also a multilabel problem. A multilabel problem is exactly what you have here.

In a multilabel classification problem you can use [0,1,0,...,1,0,0] as a target output, so for a single input sample multiple classes can be true at the same time. The output of a well-trained network in this case could be [0.01, 0.99, 0.001, ..., 0.89, 0.001, 0.0001]. In other words, you can run multiple independent binary classifications in one single network.
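To make the difference concrete, here is a minimal pure-Python sketch (no Keras needed) comparing softmax and sigmoid on the same raw scores. The five-class logits and the 0.5 threshold are made-up values for illustration only; with your data you would have 52 entries.

```python
import math

def softmax(logits):
    # Subtract the max for numerical stability; the outputs sum to 1.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(logits):
    # Each output is squashed independently into (0, 1).
    return [1.0 / (1.0 + math.exp(-z)) for z in logits]

def decode(probs, threshold=0.5):
    # Indices of all classes whose probability exceeds the threshold.
    return [i for i, p in enumerate(probs) if p > threshold]

# Hypothetical raw scores: classes 1 and 3 are both strongly present.
logits = [-4.0, 5.0, -6.0, 4.0, -5.0]

soft = softmax(logits)
sig = sigmoid(logits)

print("softmax picks:", decode(soft))
print("sigmoid picks:", decode(sig))
```

Because softmax outputs have to sum to 1, two cards compete for the same probability mass and at most one of them can end up above 0.5. With sigmoid, each card gets its own independent probability.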

I will link another very similar question that I answered in more detail. I already addressed the specific metric, activation and loss function which you could use:

multilabel classification
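As a rough illustration of what a loss and a metric for this setup compute, here is a pure-Python sketch of binary cross-entropy averaged over the labels, plus two ways to count accuracy. This is my own simplified version and not Keras code. To my knowledge, the usual Keras counterparts are activation='sigmoid' with loss='binary_crossentropy' and a metric such as tf.keras.metrics.BinaryAccuracy. The function names and sample vectors below are hypothetical.

```python
import math

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    # Mean of the per-label binary cross-entropy terms.
    total = 0.0
    for t, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(y_true)

def per_label_accuracy(y_true, y_pred, threshold=0.5):
    # Fraction of individual labels predicted correctly.
    hits = sum(int((p > threshold) == bool(t)) for t, p in zip(y_true, y_pred))
    return hits / len(y_true)

def exact_match(y_true, y_pred, threshold=0.5):
    # True only if every label of the sample is correct.
    return all((p > threshold) == bool(t) for t, p in zip(y_true, y_pred))

y_true = [0, 1, 0, 1, 0]                   # two cards present
good = [0.01, 0.99, 0.001, 0.89, 0.001]    # both cards found
miss = [0.01, 0.99, 0.001, 0.30, 0.001]    # second card missed

for name, pred in (("good", good), ("miss", miss)):
    print(name,
          round(binary_crossentropy(y_true, pred), 4),
          per_label_accuracy(y_true, pred),
          exact_match(y_true, pred))
```

Note that per-label accuracy can look deceptively high with 52 mostly-zero labels, so an exact-match check (or precision and recall) is often the more honest number to watch.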
